How to integrate MATLAB with a Hadoop cluster?

Hi, I am trying to set up MATLAB with a Hadoop cluster. When I run a MATLAB program on this cluster, I get the following errors on the data nodes.
I am using MATLAB R2016b and hadoop-2.7.2.
Error 1:
2017-02-02 15:52:41,962 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : com.mathworks.toolbox.parallel.hadoop.worker.RemoteMvm$CommunicationLostException
    at com.mathworks.toolbox.parallel.hadoop.worker.RemoteMvm.feval(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.HadoopMatlabWorker.configureWorker(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.HadoopMatlabWorker.<init>(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.MatlabWorkerSingleton.getOrCreateWorker(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.MatlabMapper.setup(Unknown Source)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at com.mathworks.toolbox.parallel.hadoop.MatlabReflectionMapper.run(Unknown Source)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error 2:
2017-02-02 15:53:34,345 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From slave2/10.70.0.102 to slave3:40326 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:242)
    at com.sun.proxy.$Proxy8.getTask(Unknown Source)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:132)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 4 more
  2 Comments
Rick Amos on 15 Feb 2017
This error message typically means that the MATLAB worker was either killed or crashed, and there are several things to check. First, do the errors go away when you run the following command in MATLAB just before running mapreduce?
distcomp.feature('HadoopReuseWorker', 'No')
If the errors still appear, do you see crash dumps when you look at the failed task view in the Hadoop scheduler web UI?
To get to that view:
  1. Go to the Hadoop scheduler web UI (typically hostname:8088)
  2. Find one of your jobs. It should have the name "MATLAB Parallel Computing Job" and be submitted by you.
  3. Click on "History" (or "ApplicationMaster" if it is still running)
  4. Under the "Attempt Type" table, click on the number that appears in the "Failed" column.
  5. This should show a table of task attempts. Click on "logs" for any one of the attempts.
  6. This should give you a list of log files, including stderr/stdout/syslog. If it reports "Aggregation is not enabled", you will need to either configure the cluster with log aggregation or do these steps while the job is still running.
  7. If there is a matlab_crash_dump file, that will contain useful information about what went wrong.
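The first check above (changing the setting just before running mapreduce) could look roughly like the following. This is only a sketch under assumptions: the install folders, the HDFS paths, and the `myMapper`/`myReducer` functions are hypothetical placeholders, not values taken from this thread.

```matlab
% Sketch only: every path and function name below is a hypothetical placeholder.

% Tell MATLAB where the Hadoop client installation lives (assumed location).
setenv('HADOOP_HOME', '/usr/local/hadoop-2.7.2');

% Create the Hadoop cluster object and point it at the MATLAB
% installation on the worker nodes (assumed location).
cluster = parallel.cluster.Hadoop;
cluster.ClusterMatlabRoot = '/usr/local/MATLAB/R2016b';

% Make mapreduce run on that cluster.
mr = mapreducer(cluster);

% The setting suggested above, applied just before calling mapreduce.
distcomp.feature('HadoopReuseWorker', 'No');

% Input data on HDFS (placeholder path); myMapper and myReducer are
% assumed to be functions on the MATLAB path.
ds = datastore('hdfs:///user/me/input/*.csv');
result = mapreduce(ds, @myMapper, @myReducer, mr, ...
    'OutputFolder', 'hdfs:///user/me/output');
```

If the errors disappear with this setting, that points at the worker reuse path; if they remain, the crash dump steps above are the next thing to try.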
Pulkesh Haran on 29 Apr 2017
Hi,
I am getting this error:
Parallel mapreduce execution on the Hadoop cluster:
********************************
*      MAPREDUCE PROGRESS      *
********************************
Map   0% Reduce   0%
Map  87% Reduce   0%
Error using mapreduce (line 118)
The HADOOP job failed to complete. Check the HADOOP log files for job 1 for more information.
Error in init (line 30)
maxHSV = mapreduce(imageDS, @hueSaturationValueMapper, @hueSaturationValueReducer,'OutputFolder',output_1_LAKH)
Hadoop data node exception:
2017-04-29 21:29:12,067 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2017-04-29 21:29:18,901 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2017-04-29 21:29:18,918 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Transport stopped.
    at com.mathworks.mvm.MvmFactory.nativeGetFactory(Native Method)
    at com.mathworks.mvm.MvmFactory.<init>(MvmFactory.java:426)
    at com.mathworks.mvm.MvmFactory.createFactory(MvmFactory.java:369)
    at com.mathworks.mvm.MvmFactory.createFactory(MvmFactory.java:292)
    at com.mathworks.toolbox.parallel.hadoop.worker.MvmPool.createMvm(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.worker.MvmPool.getOrCreateMvm(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.HadoopMatlabWorker.<init>(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.MatlabWorkerSingleton.getOrCreateWorker(Unknown Source)
    at com.mathworks.toolbox.parallel.hadoop.link.MatlabMapper.setup(Unknown Source)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at com.mathworks.toolbox.parallel.hadoop.MatlabReflectionMapper.run(Unknown Source)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
2017-04-29 21:29:18,926 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
Please find the attachment.
Thanks.


Answers (0)
