For more information about compilation and usage, please visit Spark Doris Connector.
Before building, copy customer_env.sh.tpl to customer_env.sh and configure it.
git clone git@github.com:apache/doris-spark-connector.git
cd doris-spark-connector/spark-doris-connector
./build.sh
$ docker pull apache/doris:build-env-ldb-toolchain-latest
The compiled jar has a name like: spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar
Download Spark from https://spark.apache.org/downloads.html. If you are in China, the Tencent mirror is a good alternative: https://mirrors.cloud.tencent.com/apache/spark/spark-3.1.2/
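The jar name appears to encode the Spark version, the Scala version, and the connector version, which helps when picking a matching Spark distribution. The following is a minimal sketch that splits such a name into its parts. The pattern `spark-doris-connector-<spark>_<scala>-<version>.jar` is an assumption inferred from the single example name above, and `parse_connector_jar` is a hypothetical helper, not part of the connector.

```python
import re

# Assumed naming pattern, inferred from the example jar name only:
#   spark-doris-connector-<spark>_<scala>-<version>.jar
JAR_PATTERN = re.compile(
    r"^spark-doris-connector-"
    r"(?P<spark>\d+\.\d+)_(?P<scala>\d+\.\d+)-(?P<version>.+)\.jar$"
)


def parse_connector_jar(name):
    """Return the Spark, Scala, and connector versions encoded in a jar name."""
    match = JAR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized connector jar name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    info = parse_connector_jar("spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar")
    print(info["spark"], info["scala"], info["version"])
```

For the example name this yields Spark 3.1 and Scala 2.12, which is consistent with the Spark 3.1.2 distribution used in the following steps.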
# download
wget https://mirrors.cloud.tencent.com/apache/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz
# decompress
tar -xzvf spark-3.1.2-bin-hadoop3.2.tgz
# add the two export lines to /etc/profile, then reload it
vim /etc/profile
export SPARK_HOME=/your_path/spark-3.1.2-bin-hadoop3.2
export PATH=$PATH:$SPARK_HOME/bin
source /etc/profile
cp /your_path/spark-doris-connector/target/spark-doris-connector-3.1_2.12-1.0.0-SNAPSHOT.jar $SPARK_HOME/jars
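Before starting a Spark shell, it can be worth confirming that the connector jar really landed in the jars directory. The following is a minimal sketch of such a check. `find_connector_jars` is a hypothetical helper, and the glob `spark-doris-connector-*.jar` is an assumption based on the jar name shown earlier.

```python
from pathlib import Path


def find_connector_jars(spark_home):
    """List connector jar file names found under <spark_home>/jars."""
    jars_dir = Path(spark_home) / "jars"
    if not jars_dir.is_dir():
        # A missing jars directory usually means SPARK_HOME is wrong.
        return []
    # Assumed file name prefix, taken from the example jar name.
    return sorted(p.name for p in jars_dir.glob("spark-doris-connector-*.jar"))


# Example usage (reads the same SPARK_HOME variable exported above):
#   import os
#   print(find_connector_jars(os.environ["SPARK_HOME"]))
```

An empty list suggests either that the copy step was skipped or that SPARK_HOME points somewhere else.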
Create the Doris database and table.
create database mongo_doris;
use mongo_doris;
CREATE TABLE data_sync_test_simple
(
    _id VARCHAR(32) DEFAULT '',
    id VARCHAR(32) DEFAULT '',
    user_name VARCHAR(32) DEFAULT '',
    member_list VARCHAR(32) DEFAULT ''
)
DUPLICATE KEY(_id)
DISTRIBUTED BY HASH(_id) BUCKETS 10
PROPERTIES("replication_num" = "1");
INSERT INTO data_sync_test_simple VALUES ('1','1','alex','123');
import org.apache.doris.spark._

// read the table created above as an RDD
val dorisSparkRDD = sc.dorisRDD(
  tableIdentifier = Some("mongo_doris.data_sync_test_simple"),
  cfg = Some(Map(
    "doris.fenodes" -> "127.0.0.1:8030",
    "doris.request.auth.user" -> "root",
    "doris.request.auth.password" -> ""
  ))
)
dorisSparkRDD.collect()
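A malformed "doris.fenodes" value tends to surface only as a connection error at read time, so a quick format check beforehand can save a round trip. The following is a minimal sketch, assuming the value is one or more comma-separated `host:port` entries. The comma-separated form is an assumption not shown in this document, and `parse_fenodes` is a hypothetical helper, not a connector API.

```python
def parse_fenodes(value):
    """Split a doris.fenodes string into (host, port) pairs.

    Raises ValueError if any entry is not of the form host:port.
    """
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        # rpartition splits on the last colon, leaving the port on the right.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        nodes.append((host, port_number))
    return nodes


if __name__ == "__main__":
    print(parse_fenodes("127.0.0.1:8030"))
```

The check is purely syntactic: it does not verify that the FE node is reachable or that 8030 is the correct HTTP port for a given deployment.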
spark.yarn.jars=hdfs:///spark-jars/doris-spark-connector-3.1.2-2.12-1.0.0.jar
Link:apache/doris#9486
# parentheses let the chained calls span several lines in Python
dorisSparkDF = (
    spark.read.format("doris")
    .option("doris.table.identifier", "mongo_doris.data_sync_test_simple")
    .option("doris.fenodes", "127.0.0.1:8030")
    .option("user", "root")
    .option("password", "")
    .load()
)
# show the first 5 rows
dorisSparkDF.show(5)

Type conversion for writing to Doris using Arrow

Doris           Spark
BOOLEAN         BooleanType
TINYINT         ByteType
SMALLINT        ShortType
INT             IntegerType
BIGINT          LongType
LARGEINT        StringType
FLOAT           FloatType
DOUBLE          DoubleType
DECIMAL(M,D)    DecimalType(M,D)
DATE            DateType
DATETIME        TimestampType
CHAR(L)         StringType
VARCHAR(L)      StringType
STRING          StringType
ARRAY           ARRAY
MAP             MAP
STRUCT          STRUCT

Report issues or submit a pull request
If you find any bugs, feel free to file a GitHub issue or fix it by submitting a pull request.
Contact us through the following mailing list.