MitySOM-5CSX DevKit OpenCL BSP

Critical Link is pleased to announce support for Altera's SDK for OpenCL on the MitySOM-5CSX module.

The MitySOM-5CSX module (5CSX-H6-4YA configuration) has been tested running examples built with Altera's SDK for OpenCL, integrated with version 16.0 of Altera Quartus Prime and the SoC Embedded Design Suite (EDS).

This documentation outlines using the SDK for OpenCL in a Linux environment. Similar instructions should apply for Windows users. If you have difficulty setting up the software, please contact Critical Link or Altera.

For information on purchasing the MitySOM-5CSX, visit http://www.criticallink.com/product/mitysom-5csx/.

Useful Documentation Links

Pre-Built Binaries

An SD card image that includes support for running the Altera SDK for OpenCL project examples is available below:

5CSX-H6-4YA MitySOM_5CSX_H6_4YA_Dev_Kit_OpenCL_Release_1.zip

Within the zip file is a mitysom5csx_devkit.img image file. Refer to Building SD Card Image for information on how to program an SD card with the provided image file for booting.

Required Tools

In order to compile an OpenCL project for the MitySOM-5CSX, you will need to download the latest version of Altera's SDK for OpenCL (which includes a copy of Quartus Prime) and install the Cyclone V device support libraries. You will also require the Altera SoC Embedded Design Suite (EDS).

The board support package files necessary to compile the example programs for the MitySOM-5CSX module on the Critical Link MitySOM-5CSX developer's kit base board can be downloaded from our git repository server.

To build an accelerated project for a MitySOM-5CSX installed on a DevKit, you will need the board files and reference project data. These can be fetched from our git server. On Linux, the commands are:

user@machine# git clone git://support.criticallink.com/home/git/mitysom-5cs/mitysom_5csx_dev_board.git
user@machine# cd mitysom_5csx_dev_board
user@machine# git checkout master

Compiling an example project for the MitySOM-5CSX Developer's Kit

These instructions are for compiling on a 64-bit Linux system with a bash shell. They should be similar for a Windows system.

First, the correct environment should be configured to support the aocl build tool.

user@machine# export ALTERA=/home/${USER}/altera/16.0
user@machine# export ALTERAOCLSDKROOT="${ALTERA}/hld" 
user@machine# export PATH=$PATH:$ALTERAOCLSDKROOT/bin
user@machine# export LD_LIBRARY_PATH=$ALTERAOCLSDKROOT/host/linux64/lib:$LD_LIBRARY_PATH
user@machine# export AOCL_BOARD_PACKAGE_ROOT=/home/${USER}/mitysom_5csx_dev_board/opencl/board
user@machine# export LD_LIBRARY_PATH=$AOCL_BOARD_PACKAGE_ROOT/linux64/lib:$LD_LIBRARY_PATH
user@machine# export QSYS_ROOTDIR="${ALTERA}/quartus/sopc_builder/bin"
user@machine# . ${ALTERA}/embedded/embedded_command_shell.sh
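
Before running aoc, it can help to confirm that the variables exported above actually took effect in the current shell. The following is a minimal sketch, not part of the BSP: the helper name missing_vars is hypothetical, and it only checks the four variables named in the setup above plus whether aoc is reachable on the PATH.

```shell
# List the build-environment variables that are unset or empty.
# Prints one name per line; prints nothing if all four are set.
missing_vars() {
    for name in ALTERA ALTERAOCLSDKROOT AOCL_BOARD_PACKAGE_ROOT QSYS_ROOTDIR; do
        eval "value=\${$name:-}"   # indirect lookup of the variable named in $name
        [ -n "$value" ] || echo "$name"
    done
}

for name in $(missing_vars); do
    echo "Not set: $name"
done

# aoc should be reachable once $ALTERAOCLSDKROOT/bin is on the PATH.
command -v aoc >/dev/null 2>&1 || echo "aoc not found on PATH"
```

If this prints nothing, the variables are set and aoc was found; it does not verify that the paths point at a working installation.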

Once the build environment is configured, the hello_world.aocx and ARM host application can be built with the following commands:

user@machine# cd /home/$USER/mitysom_5csx_dev_board/opencl/examples/hello_world
user@machine# aoc --board dev_5csx_h6_4ya -v device/hello_world.cl -o bin/hello_world
user@machine# make

Running an example project on the MitySOM-5CSX Developer's Kit

IMPORTANT: the Altera SDK for OpenCL packs the Cyclone V FPGA fabric configuration data as compressed RBF files. This means you will need to configure the MitySOM-5CSX module boot configuration to support compressed image configuration. On the DevKit, this means setting the S100 configuration header (S1 through S10) to 0100000101. See the figure below for the switch positions.
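
To avoid miscounting digits when setting the header, the bit string can be expanded into one line per switch. This is only an illustrative sketch: the helper name s100_positions is hypothetical, and it reports the logic value for each switch without assuming which physical position (ON or OFF) corresponds to 1.

```shell
# Expand an S100 bit string (S1 first) into one line per switch.
# Only the 0/1 value from the text is shown; the physical switch
# position for each value must be taken from the board figure.
s100_positions() {
    bits=$1
    i=1
    while [ "$i" -le "${#bits}" ]; do
        echo "S$i=$(printf '%s' "$bits" | cut -c "$i")"
        i=$((i + 1))
    done
}

s100_positions 0100000101
```

For the string above, only S2, S8, and S10 carry a 1.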

The best way to start is to use the SD card image included in the BSP download package. Create the SD card according to the instructions and insert the card into the DevKit. The DevKit should boot to a Linux prompt. Log in as root.

At the root command shell, run the OpenCL initialization script to configure your path to include the runtime OpenCL libraries and executables and load the OpenCL FPGA communications driver:

root@mitysom-5csx-h6-4ya:~# . ./init_opencl.sh 
aclsoc_drv: module is from the staging directory, the quality is unknown, you have been warned.
root@mitysom-5csx-h6-4ya:~#

Note: the warning is due to the fact that Critical Link has incorporated Altera's OpenCL driver into the kernel staging folder in order to support "in-tree" builds of the kernel module. This warning can be safely ignored and will go away once the driver is migrated out of the staging area of the kernel source tree.
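
If an example later fails to find the device, one quick check is whether the driver module is actually loaded. The sketch below assumes the module name aclsoc_drv from the warning message above and reads the standard /proc/modules listing; the helper name module_listed is hypothetical.

```shell
# Succeeds if the named module appears in the first column of
# lsmod-style or /proc/modules-style text read from stdin.
module_listed() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ] && module_listed aclsoc_drv < /proc/modules; then
    echo "aclsoc_drv is loaded"
else
    echo "aclsoc_drv is not loaded; re-run . ./init_opencl.sh"
fi
```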

Once the environment is loaded, you can run the hello_world and vector_add examples as shown below.

root@mitysom-5csx-h6-4ya:~# cd opencl_examples/
root@mitysom-5csx-h6-4ya:~/opencl_examples# ./hello_world 
Querying platform for info:
==========================
CL_PLATFORM_NAME                         = Altera SDK for OpenCL
CL_PLATFORM_VENDOR                       = Altera Corporation
CL_PLATFORM_VERSION                      = OpenCL 1.0 Altera SDK for OpenCL, Version 16.0.2

Querying device for info:
========================
CL_DEVICE_NAME                           = dev_5csx_h6_4ya : Cyclone V SoC Development Kit
CL_DEVICE_VENDOR                         = Altera Corporation
CL_DEVICE_VENDOR_ID                      = 4466
CL_DEVICE_VERSION                        = OpenCL 1.0 Altera SDK for OpenCL, Version 16.0.2
CL_DRIVER_VERSION                        = 16.0
CL_DEVICE_ADDRESS_BITS                   = 64
CL_DEVICE_AVAILABLE                      = true
CL_DEVICE_ENDIAN_LITTLE                  = true
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE          = 32768
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE      = 0
CL_DEVICE_GLOBAL_MEM_SIZE                = 536870912
CL_DEVICE_IMAGE_SUPPORT                  = true
CL_DEVICE_LOCAL_MEM_SIZE                 = 16384
CL_DEVICE_MAX_CLOCK_FREQUENCY            = 1000
CL_DEVICE_MAX_COMPUTE_UNITS              = 1
CL_DEVICE_MAX_CONSTANT_ARGS              = 8
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE       = 134217728
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS       = 3
CL_DEVICE_MEM_BASE_ADDR_ALIGN            = 8192
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE       = 1024
CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR    = 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT   = 2
CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT     = 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG    = 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT   = 1
CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE  = 0
Command queue out of order?              = false
Command queue profiling enabled?         = true
Using AOCX: hello_world.aocx
Reprogramming device with handle 1

Kernel initialization is complete.
Launching the kernel...

Thread #2: Hello from Altera's OpenCL Compiler!

Kernel execution is complete.
root@mitysom-5csx-h6-4ya:~/opencl_examples#
root@mitysom-5csx-h6-4ya:~/opencl_examples# ./vector_add
Initializing OpenCL
Platform: Altera SDK for OpenCL
Using 1 device(s)
  dev_5csx_h6_4ya : Cyclone V SoC Development Kit
Using AOCX: vector_add.aocx
Reprogramming device with handle 1
Launching for device 0 (1000000 elements)

Time: 160.653 ms
Kernel time (device 0): 7.767 ms

Verification: PASS
root@mitysom-5csx-h6-4ya:~/opencl_examples#
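
The two times reported by vector_add can be compared directly. The sketch below only does arithmetic on the figures printed above; it assumes that "Time" is the host-measured time for the whole run and that "Kernel time" is the portion spent executing on the FPGA, which the output itself does not state.

```shell
# Figures copied from the vector_add output above.
total_ms=160.653
kernel_ms=7.767

awk -v t="$total_ms" -v k="$kernel_ms" 'BEGIN {
    printf "kernel share of total time: %.1f%%\n", 100 * k / t
    printf "time outside the kernel:    %.3f ms\n", t - k
}'
```

Under those assumptions, the kernel accounts for roughly 5% of the reported time in this run, so most of the time is spent outside the kernel itself.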

Integrating Non-OpenCL Logic with OpenCL Kernels

If you are trying to do more than simply accelerate the ARM using the Cyclone V FPGA fabric, you'll need to modify your FPGA board system to include your additional logic.

Common scenarios that require modification of the reference board design might include:

- You have custom ARM-controlled interfaces (e.g., PIO) that the ARM needs to access via the lightweight HPS-to-FPGA (lwhps2fpga) bridge.
- You have Avalon streaming input interfaces that you want to feed the OpenCL kernels.
- You have Avalon streaming output interfaces that you want to drive from OpenCL kernels.

Critical Link provides a simple example that illustrates handling the first two of these scenarios. The example consists of two parts: a modified reference board setup that includes the added FPGA fabric logic, and an example OpenCL reference project that uses the additional fabric. A block diagram of the example is illustrated below:

The example design updates the reference FPGA QSYS project with the following modifications:

- Exposes a portion of the lwhps2fpga bus bridge to the system level (address region 0xFF210000-0xFF21FFFF) from the acl_iface_system block.
- Exposes the fpga2hps bus interconnect (64 bits in width) from the acl_iface_system block.
- Instantiates an Altera Memory Mapped Master to Avalon Stream (Read Master and Controller Block) Scatter-Gather DMA Engine. The SGDMA Read Master can read from the fpga2hps bus and generates an Avalon Stream output, which is left unconnected and available as an OpenCL Channel IO connection point.
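
The size of the exposed lwhps2fpga window, and its offset within the bridge, follow from the address range listed above. This is a small worked calculation, assuming the lwhps2fpga bridge base of 0xFF200000 from the Cyclone V HPS address map (that value comes from the device documentation, not from this page).

```shell
# Address window exposed from acl_iface_system, as listed above.
WIN_START=0xFF210000
WIN_END=0xFF21FFFF
# Assumed base of the lwhps2fpga bridge in the Cyclone V HPS address map.
LWHPS2FPGA_BASE=0xFF200000

win_size=$(($WIN_END - $WIN_START + 1))
win_offset=$(($WIN_START - $LWHPS2FPGA_BASE))

printf 'size   = %d bytes (%d KB)\n' "$win_size" "$((win_size / 1024))"
printf 'offset = 0x%X from the bridge base\n' "$win_offset"
```

The window is 64 KB and, under the stated assumption, starts 0x10000 bytes into the bridge.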

The example OpenCL reference project provides a simple OpenCL file that reads a configured number of words from the Avalon Stream source and writes them to memory (using the fpga2sdram bridge). The ARM software sets up a simple case that transmits a known counting pattern via the SGDMA to the Avalon Stream, which then feeds the OpenCL kernel. The ARM software runs the kernel and verifies that the input and output buffers match.

Building the Example Channel IO project

The example data can be obtained from Critical Link's git repository with the following (Linux) command:

user@host # git clone git://support.criticallink.com/home/git/mitysom-5cs/mitysom_5csx_dev_board.git

The board files are located in mitysom_5csx_dev_board/opencl/board/dev_5csx_h6_4ya_wchannels and the OpenCL example project is located in mitysom_5csx_dev_board/opencl/examples/channel_test.

To build the FPGA AOCX file and the ARM host application run the following commands.

user@host # cd mitysom_5csx_dev_board
user@host # export AOCL_BOARD_PACKAGE_ROOT=`pwd`/opencl/board
user@host # cd opencl/examples/channel_test
user@host # make
user@host # aoc -v --board dev_5csx_h6_4ya_wchannels device/channel_test.cl -o bin/channel_test

Running the Example Channel IO project

This project drives the modular scatter-gather DMA engine from user space by memory mapping through /dev/mem. For this technique to work, the upper 256 MB of physical memory must be reserved so that the kernel does not manage or use it. To run this example, update the boot arguments passed to the kernel to include "mem=768M". This reserves the top 256 MB of DDR RAM for use with the SGDMA engine.
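
The relationship between the mem= value and the reserved region can be worked out as follows. This sketch assumes 1 GB of DDR starting at physical address 0 (inferred from 768 MB plus the reserved 256 MB, not stated explicitly above); the helper name has_mem_limit is hypothetical. It also checks /proc/cmdline on the running system for the argument.

```shell
# Assumptions: 1 GB of DDR starting at physical address 0, and the
# kernel limited to the first 768 MB via "mem=768M".
TOTAL_MB=1024
KERNEL_MB=768

reserved_mb=$((TOTAL_MB - KERNEL_MB))
reserved_base=$((KERNEL_MB * 1024 * 1024))
printf 'reserved: %d MB starting at 0x%08X\n' "$reserved_mb" "$reserved_base"

# Succeeds if the kernel command line passed as $1 contains mem=768M.
has_mem_limit() {
    case " $1 " in
        *" mem=768M "*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /proc/cmdline ] && has_mem_limit "$(cat /proc/cmdline)"; then
    echo "mem=768M is active"
else
    echo "mem=768M not found in /proc/cmdline"
fi
```

Under these assumptions the reserved region begins at physical address 0x30000000; confirm the actual memory map of your module before mapping it through /dev/mem.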

Additional OpenCL examples

MitySOM_5CSX_H6_4YA_Dev_Kit_OpenCL_Release_1.zip (137 MB) Michael Williamson, 10/07/2016 09:38 AM

IMG_20161006_131311145.jpg (46.4 KB) Michael Williamson, 10/07/2016 09:50 AM

BlockDiagram.png (49 KB) Michael Williamson, 10/18/2016 07:18 AM
