kafka.cluster.BrokerEndPoint cannot be cast to kafka.cluster.Broker - scala, apache-spark, apache-kafka

I am using kafka_2.11-0.11.0.1, Scala 2.11, and Spark 2.2.0. I have added the following jars to my Eclipse Java build path:

kafka-streams-0.11.0.1,
kafka-tools-0.11.0.1,
spark-streaming_2.11-2.2.0,
spark-streaming-kafka_2.11-1.6.3,
spark-streaming-kafka-0-10_2.11-2.2.0,
kafka_2.11-0.11.0.1.

And my code is given below:

import kafka.serializer.StringDecoder
import kafka.api._
import kafka.api.ApiUtils._
import org.apache.spark.SparkConf
import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.dstream._
import org.apache.spark.streaming.kafka
import org.apache.spark.streaming.kafka._
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.storage.StorageLevel
import org.apache.spark.SparkContext._


object KafkaExample {

  def main(args: Array[String]) {

    val ssc = new StreamingContext("local[*]", "KafkaExample", Seconds(1))

    val kafkaParams = Map("bootstrap.servers" -> "kafkaIP:9092")

    val topics = List("logstash_log").toSet

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics).map(_._2)

    stream.print()

    ssc.checkpoint("C:/checkpoint/")
    ssc.start()
    ssc.awaitTermination()
  }
}

This is very simple code, just to connect Spark and Kafka. However, it throws this error:

Exception in thread "main" java.lang.ClassCastException: kafka.cluster.BrokerEndPoint cannot be cast to kafka.cluster.Broker
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3$$anonfun$apply$6$$anonfun$apply$7.apply(KafkaCluster.scala:90)
at scala.Option.map(Option.scala:146)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3$$anonfun$apply$6.apply(KafkaCluster.scala:90)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3$$anonfun$apply$6.apply(KafkaCluster.scala:87)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3.apply(KafkaCluster.scala:87)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3.apply(KafkaCluster.scala:86)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:94)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2.apply(KafkaCluster.scala:86)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2.apply(KafkaCluster.scala:85)
at scala.util.Either$RightProjection.flatMap(Either.scala:522)
at org.apache.spark.streaming.kafka.KafkaCluster.findLeaders(KafkaCluster.scala:85)
at org.apache.spark.streaming.kafka.KafkaCluster.getLeaderOffsets(KafkaCluster.scala:179)
at org.apache.spark.streaming.kafka.KafkaCluster.getLeaderOffsets(KafkaCluster.scala:161)
at org.apache.spark.streaming.kafka.KafkaCluster.getLatestLeaderOffsets(KafkaCluster.scala:150)
at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$5.apply(KafkaUtils.scala:215)
at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$5.apply(KafkaUtils.scala:211)
at scala.util.Either$RightProjection.flatMap(Either.scala:522)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:211)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at com.defne.KafkaExample$.main(KafkaExample.scala:28)
at com.defne.KafkaExample.main(KafkaExample.scala)

Where am I going wrong?

NOTE: I tried "metadata.broker.list" instead of "bootstrap.servers", but nothing changed.

Answers:

Answer #1 (0 votes):

Your problem is that you have too many Kafka dependencies, and the ones being picked up at runtime are not compatible with the version Spark expects.

Your actual problem is the PartitionMetadata class. In 0.8.2 (which is what you get from spark-streaming-kafka_2.11-1.6.3) it looks like this:

case class PartitionMetadata(partitionId: Int,
                             val leader: Option[Broker],
                             replicas: Seq[Broker],
                             isr: Seq[Broker] = Seq.empty,
                             errorCode: Short = ErrorMapping.NoError) extends Logging

And in > 0.10.0.0, like this:

case class PartitionMetadata(partitionId: Int,
                             leader: Option[BrokerEndPoint],
                             replicas: Seq[BrokerEndPoint],
                             isr: Seq[BrokerEndPoint] = Seq.empty,
                             errorCode: Short = Errors.NONE.code) extends Logging

See how leader changed from Option[Broker] to Option[BrokerEndPoint]? That is what Spark is complaining about.
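If you want to confirm which Kafka jar is actually being picked up at runtime, one quick check (my own sketch, not part of the original answer) is to ask the JVM where it loaded the class from:

object WhichKafkaJar {
  def main(args: Array[String]) {
    // Prints the jar that kafka.api.PartitionMetadata was loaded from
    // (assuming the class exists under that name in your Kafka version),
    // e.g. file:/.../kafka_2.11-0.11.0.1.jar
    val location = Class.forName("kafka.api.PartitionMetadata")
      .getProtectionDomain.getCodeSource.getLocation
    println(location)
  }
}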

You need to clean up your dependencies. Everything you need (if you are using Spark 2.2) is:

spark-streaming_2.11-2.2.0,
spark-streaming-kafka-0-10_2.11-2.2.0
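
Expressed as a build definition, that would be roughly the following. This is a minimal sketch assuming an sbt project instead of the hand-managed Eclipse build path; the coordinates match the two jars above:

// build.sbt - minimal sketch; version numbers taken from the answer above
scalaVersion := "2.11.11" // any Scala 2.11.x

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"            % "2.2.0",
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.2.0"
)

Note that switching to the 0-10 integration also changes the code: it uses the new Kafka consumer, so the string deserializers become consumer properties instead of Decoder type parameters, and a group.id is required. Here is a sketch of the question's job rewritten against spark-streaming-kafka-0-10 (the group.id value is an arbitrary placeholder):

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaExample {

  def main(args: Array[String]) {

    val ssc = new StreamingContext("local[*]", "KafkaExample", Seconds(1))

    // With the new consumer API, deserializers are ordinary properties.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafkaIP:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "kafka-example", // placeholder consumer group id
      "auto.offset.reset" -> "latest"
    )

    val topics = Set("logstash_log")

    // Decoder type parameters are gone; location and consumer strategies
    // take their place.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](topics, kafkaParams)
    ).map(_.value)

    stream.print()

    ssc.checkpoint("C:/checkpoint/")
    ssc.start()
    ssc.awaitTermination()
  }
}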