-----Original Message-----
From: Abhi Basu [mailto:9000revs@gmail.com]
Sent: 23 March 2018 20:42
To: solr-user@lucene.apache.org
Subject: Solr on HDInsight to write to Active Data Lake
MS Azure does not support Solr 4.9 on HDI, so I am posting here. I would
like to write index collection data to HDFS (hosted on ADL).
Note: I am able to reach ADL from the hadoop fs command line, so Hadoop is
configured correctly to access ADL:
hadoop fs -ls adl://
This is what I have done so far:
1. Copied all required jars to the Solr ext lib folder:
sudo cp -f /usr/hdp/current/hadoop-client/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar /usr/hdp/current/solr/example/lib/ext
This includes the Azure Data Lake jars as well.
2. Edited the solrconfig.xml file for my collection:
<dataDir>${solr.core.name}/data/</dataDir>
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">adl://esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/solr/</str>
  <str name="solr.hdfs.confdir">/usr/hdp/2.6.2.25-1/hadoop/conf</str>
  <str name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true}</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
</directoryFactory>
When this collection is deployed to Solr, I see this error message:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2189</int>
  </lst>
  <lst name="failure">
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard2_replica2': Unable to create core: ems-collection_shard2_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard2_replica1': Unable to create core: ems-collection_shard2_replica1 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard1_replica1': Unable to create core: ems-collection_shard1_replica1 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard1_replica2': Unable to create core: ems-collection_shard1_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
  </lst>
</response>
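Since the error is a "Class ... not found", one quick check is whether any jar
in the ext folder actually contains that class. The sketch below is a rough
check, not a definitive diagnosis. The function name find_class_in_jars is
hypothetical, and it assumes that entry names inside a jar are stored as plain
text, so grep can match them without unpacking the archive.

```shell
# Sketch: print the jars in a directory that mention a given class.
# Assumption: jar (zip) entry names are stored uncompressed, so a
# fixed-string grep on the binary file is enough for a quick look.
find_class_in_jars() {
  dir=$1
  # Turn a dotted class name into its entry path, e.g. a.b.C -> a/b/C.class
  class_path="$(printf '%s' "$2" | tr '.' '/').class"
  # -l: print only the names of matching files; -F: match a literal string
  grep -lF -- "$class_path" "$dir"/*.jar 2>/dev/null
}

# Example call with the folder and class name from this message:
# find_class_in_jars /usr/hdp/current/solr/example/lib/ext \
#     org.apache.hadoop.fs.adl.HdiAdlFileSystem
```

If the call prints nothing, none of the copied jars carries the class, which
would point to a missing jar rather than to the directoryFactory settings.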
Has anyone done this and can help me out?
Thanks,
Abhi
--
Abhi Basu