...

Thanks to an agreement with MathWorks, CINECA provides several MATLAB licenses through its own license server for use on CINECA clusters.
Usage of the CINECA MATLAB licenses is allowed exclusively for Open Science (non-commercial) activities.
If you are interested in using these licenses and declare that your activity is devoted to Open Science, please write to superc@cineca.it to be enabled to use them.

...

This section provides the steps to configure MATLAB to submit jobs to a cluster, retrieve results, and debug errors.

MATLAB can be configured to submit jobs to CINECA clusters either directly from your local MATLAB installation (Remote Jobs submission) or from the login nodes of CINECA clusters (On Cluster submission).

Remote Jobs submission

It is possible to submit MATLAB jobs to the compute nodes of a CINECA cluster directly from your local MATLAB installation.

...

For different solutions you can refer to this MathWorks dedicated User Guide page.

...

To manage the local cluster configuration, select "Parallel" in the top menu, then "Create and Manage Clusters...".
A window opens where you can modify the Additional Properties of your configuration as needed (see the following sections for a description of the available properties).

...

On Cluster submission

As an alternative to remote submission, you can launch MATLAB jobs directly from the login nodes of CINECA clusters.

Log in to the Marconi or Galileo100 cluster and load the MATLAB module:

$ module load profile/eng
$ module load autoload matlab/<version>

...

>> % Specify QoS
>> c.AdditionalProperties.QoS = 'name-of-qos';

>> % Specify the number of nodes you want to use.
>> c.AdditionalProperties.NumberOfNodes = 1;

>> % Specify processor cores per node.  Default is 18 for Marconi and 48 for Galileo100.
>> c.AdditionalProperties.ProcsPerNode = 18;

>> % Specify memory to use for MATLAB jobs, per core (default: 4gb)
>> c.AdditionalProperties.MemUsage = '6gb';

>> % Require node exclusivity
>> c.AdditionalProperties.RequireExclusiveNode = true;

>> % Request to use a reservation
>> c.AdditionalProperties.Reservation = 'name-of-reservation';

>> % Specify e-mail address to receive notifications about your job
>> c.AdditionalProperties.EmailAddress = 'test@foo.com';

>> % Turn on debug messages.  Default is off (logical true/false).
>> c.AdditionalProperties.DebugMessagesTurnedOn = true;


On Galileo100 it is also possible to make use of GPUs for visualization and interactive jobs

...

>> % To clear a configuration that takes a string as input
>> c.AdditionalProperties.EmailAddress = '';

Save the profile so that your configuration is available in future sessions:

>> c.saveProfile;

Serial Jobs

Use the batch command to submit asynchronous jobs to the cluster.  The batch command will return a job object which is used to access the output of the submitted job.  See the MATLAB documentation for more help on batch.

...

>> fetchOutputs(j)

Display the diary

>> diary(j)

Delete the job after results are no longer needed

>> j.delete;

To retrieve a list of currently running or completed jobs, call parcluster to retrieve the cluster object.  The cluster object stores an array of jobs that were run, are running, or are queued to run.  This allows us to fetch the results of completed jobs.  Retrieve and view the list of jobs as shown below.

>> c = parcluster;
>> jobs = c.Jobs;

Once we’ve identified the job we want, we can retrieve the results as we’ve done previously.
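As a minimal sketch of that lookup, the snippet below filters the job list down to finished jobs and fetches the outputs of one of them. It assumes the default cluster profile is the one configured above and that the chosen job was created from a function with output arguments; picking the last finished job is only an illustrative choice.

```matlab
% Minimal sketch: select a finished job from the cluster object and fetch its outputs.
c = parcluster;                                   % default cluster profile
finishedJobs = findJob(c, 'State', 'finished');   % keep only completed jobs

if isempty(finishedJobs)
    disp('No finished jobs found.');
else
    % Assumption: the last finished job in the list is the one we want.
    j = finishedJobs(end);
    fprintf('Job ID %d, state: %s\n', j.ID, j.State);
    results = fetchOutputs(j);                    % cell array of output arguments
    disp(results);
end
```

Filtering by state first avoids calling fetchOutputs on a job that is still queued or running.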

...

>> % Run a parfor over 1000 iterations
>> parfor idx = 1:1000
a(idx) = rand();
end

Once we’re done with the pool, delete it.

>> % Delete the pool
>> pool.delete;

Batch Jobs

Users can also submit parallel workflows with batch.

...

>> % Fetch the results after a finished state is retrieved

>> j.fetchOutputs{:}
ans = 15 15.5328


>> % Display the diary

>> diary(j)


The job ran in 15.53 seconds using 4 workers.

...

>> % Fetch the results
>> j.fetchOutputs{:}
ans =
6.4488
>> % If necessary, retrieve output/error log file
>> c.getDebugLog(j)

...

It is an implementation of the HPCC Global HPL benchmark:

function perf = hpccLinpack( m )

...

Start by submitting on 1 core, with m = 1024:

>> j = c.batch(@hpccLinpack, 1, {1024}, 'Pool', 1)

...

Repeat on one full node on Marconi:

>> j = c.batch(@hpccLinpack, 1, {1024}, 'Pool', 35)

...

Increase the size of the matrix:

>> j = c.batch(@hpccLinpack, 1, {2048}, 'Pool', 35)
Data size: 0.031250 GB
Performance: 2.466961 GFlops
>> j = c.batch(@hpccLinpack, 1, {4096}, 'Pool', 35)

...

Data size: 0.125000 GB
Performance: 14.709730 GFlops

and double the matrix size:

>> j = c.batch(@hpccLinpack, 1, {8192}, 'Pool', 71)
Data size: 0.500000 GB
Performance: 86.003520 GFlops

...

>> j = c.batch(@hpccLinpack, 1, {16384}, 'Pool', 35)
Data size: 2.000000 GB
Performance: 356.687648 GFlops
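
The "Data size" values reported in these runs are consistent with a dense m-by-m matrix of double-precision values (8 bytes each), expressed in units of 1024^3 bytes. This is an inference from the printed numbers, not something taken from the hpccLinpack source; the short sketch below reproduces the figures for the matrix sizes used above.

```matlab
% Sketch: estimate the memory footprint of an m-by-m double matrix.
% Assumption: "Data size" counts 8 bytes per element and 1 GB = 1024^3 bytes.
m = [2048 4096 8192 16384];          % matrix sizes used in the runs above
bytesPerDouble = 8;
dataSizeGB = bytesPerDouble * m.^2 / 1024^3;

for k = 1:numel(m)
    fprintf('m = %5d  Data size: %f GB\n', m(k), dataSizeGB(k));
end
% Prints 0.031250, 0.125000, 0.500000 and 2.000000 GB respectively.
```

Each doubling of m quadruples the data size, which is useful when checking that the problem fits in the memory requested through MemUsage.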

Debugging

If a serial job produces an error, we can call the getDebugLog method to view the error log file.

...