Purpose

f2cc (ForSyDe-2-CUDA C) is a software synthesis tool developed as part of the ForSyDe project. It synthesizes models into either C code or CUDA C code; the latter can be compiled for parallel execution on a CUDA-enabled NVIDIA graphics card.

Overview

Two versions of f2cc are available: a stable version (0.1) and an experimental one (0.2pa). Both follow the same execution flow, shown in the picture below:

  • parsing of a ForSyDe intermediate model and storing it in an internal representation;
  • applying a series of model-to-model transformations for exploiting different types of parallelism;
  • synthesizing backend code from the modified model.

The f2cc tool execution flow

Version 0.1 (stable)

f2cc v0.1 was developed by Gabriel Hjort Blindell as part of his Master's thesis at KTH (Royal Institute of Technology), Sweden. For more information about f2cc, please consult his thesis report.


This version of the tool takes as input GraphML files generated by the ForSyDe-Haskell modeling framework, with tool-specific annotations (described in section 9.2 of the thesis report). It handles models based on the synchronous MoC (model of computation) that consist of the following process types:

  • MapSY
  • ParallelMapSY
  • ZipWithNSY
  • UnzipxSY
  • ZipxSY
  • DelaySY

The tool recognizes simple data parallel patterns in a model (split-map-merge, as shown in the picture below), coalesces them into ParallelMapSY processes and generates CUDA code for such regions.

Parallel patterns identified by f2cc v0.1
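
As a rough illustration of what a split-map-merge region computes, the sketch below shows a sequential C equivalent: a vector is split into elements, one function is applied to each element independently, and the results are merged in order. This is not output generated by f2cc; the names parallel_map and scale_by_two are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element function; it stands in for the function
   argument shared by the MapSY processes in the pattern. */
int scale_by_two(int x)
{
    return 2 * x;
}

/* Sequential equivalent of one split-map-merge region:
   split the input vector into elements, apply f to each element,
   and merge the results back in the original order. */
void parallel_map(const int *in, int *out, size_t n, int (*f)(int))
{
    assert(in != NULL && out != NULL && f != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* No iteration depends on another, so each one could be
           handled by a separate CUDA thread. */
        out[i] = f(in[i]);
    }
}
```

Because the loop iterations are independent, a region of this shape is a natural candidate for data-parallel execution on the GPU.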

Version 0.2pa (experimental)

f2cc v0.2 is an improved version developed by George Ungureanu as part of his Master's thesis at KTH (Royal Institute of Technology), Sweden. For more information about this version of f2cc, please consult his thesis report.


This version of the tool is still under development and is unstable. It enhances v0.1 with the following new features:

  • recognizes and parses XML files generated by ForSyDe-SystemC;
  • extracts C code from ForSyDe-SystemC model files that respect the coding conventions described in Appendix A of the thesis report;
  • implements hierarchy and hierarchical model-to-model transformations in the internal model representation;
  • dumps the internal model representation into XML files after each transformation;
  • identifies parallel sections by analyzing process constructors and data dependencies, so the exploitation of data parallelism is no longer restricted to split-map-merge patterns;
  • uses an empirical, general cost-based platform model to represent the execution platform (here, the CPU as CUDA host and the GPGPU as CUDA device);
  • implements simple platform mapping optimization algorithms based on costs describing execution and data transfers;
  • exploits time parallelism by using a heuristic load-balancing algorithm to split the contained sections (destined for execution on the GPGPU device) into balanced pipeline stages;
  • synthesizes pipelined CUDA code using CUDA streams (incomplete).
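
To make the idea of cost-based platform mapping more concrete, here is a minimal sketch of one possible decision rule: a section is mapped to the device only if its estimated device execution cost plus data transfer cost is lower than its estimated host execution cost. This is an assumption made for illustration, not the algorithm implemented in f2cc; the type section_cost and the function should_map_to_device are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cost record for one parallel section. Units are
   arbitrary but must be the same for all three fields. */
typedef struct {
    double host_exec_cost;   /* estimated cost of running on the CPU */
    double device_exec_cost; /* estimated cost of running on the GPGPU */
    double transfer_cost;    /* estimated cost of copying data to and from the device */
} section_cost;

/* Returns true when offloading is estimated to be cheaper than
   keeping the section on the host, transfers included. */
bool should_map_to_device(const section_cost *c)
{
    assert(c != NULL);
    return c->device_exec_cost + c->transfer_cost < c->host_exec_cost;
}
```

Under this rule a section with little computation stays on the host even if the device executes it faster, because the transfer cost dominates.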

Currently, it handles ForSyDe-SystemC generated models using the SY MoC and containing the following leaf process types: comb; comb2; comb3; comb4; combX; zipX; unzipX; delay.

The transformations performed during the design flow are illustrated by the animation below.

https://forsyde.ict.kth.se/files/f2cc_animation_thumb.gif

Currently, the tool generates both C and CUDA code, but the user has to make a number of manual corrections before the generated code compiles.


Usage

f2cc is a command-line program. It parses a model represented as a GraphML file (v0.1) or as a ForSyDe-SystemC generated XML file (v0.2), and outputs the synthesized code as a set of header (.h) and C (.c) files.

To synthesize a ForSyDe-Haskell generated model in a file model.graphml, simply run:

./f2cc model.graphml

This will invoke the execution path in v0.1, regardless of the version of the tool you are using, and synthesize the model into CUDA C code using default synthesis parameters.

To synthesize a ForSyDe-SystemC generated model with the top module top_module.xml, first prepare the environment as suggested in Appendix A of the thesis report, then run:

./f2cc top_module.xml

This works only with v0.2 and automatically invokes the execution path associated with that version. It synthesizes the model into CUDA C code using default synthesis parameters.

To see a list of all command switches, run:

./f2cc --help

The tool logs all actions into a log file (by default, the log file is output.log). The verbosity can be controlled using the --log-level switch. This can be useful when analyzing the synthesis process or debugging.
