#
#  ======== readme.txt ========
#

Overview
=========================================================================
This example performs on-demand loading of DSP processors and illustrates
how to establish IPC communication.

There are three host programs: the manager, consumer, and producer. The
manager program must be started first. It is used to start and stop all
other programs in the example. The manager is also used for sending commands
to the other programs. The producer simply generates data in order to
exercise the connections. The consumer is used for terminating the data
path. You can start only one manager, but you can have multiple producers
and consumers running concurrently.

There are two DSP programs: the combiner and transformer. The transformer
has one input and one output. It is used to process the input data and to
generate new output data. The combiner has two inputs and one output. Its
purpose is simply to combine two data paths into one.

With these programs, you can build multiple data graphs. Starting and
stopping a program illustrates how to initialize and finalize the IPC
framework. Making connections illustrates how to attach IPC to various
end points. Starting and stopping the data flow illustrates how to send
messages.

For example, the following data graph illustrates how to run IPC programs
on the host only (without any programs running on the DSP).

    producer [HOST] --> consumer [HOST]

The following graph adds the DSP.

    producer [HOST] --> transformer [DSP1] --> consumer [HOST]

Here is an example with two producers. The combiner is used to join the
two data streams into one.

    producer --> transformer --
                               \
                                +--> combiner --> consumer
                               /
    producer --> transformer --


Known Issues
=========================================================================

 A. The producer node hangs in the shutdown phase if it is connected to a
    combiner node. As a workaround, connect the producer to either a
    consumer or a transformer.

    The cause is that the pause command is not handled correctly. The pump
    thread in the producer node still sends a message downstream after it
    has received the pause command, which creates a deadlock in the
    shutdown phase. This is not a problem when the producer is connected
    to a consumer or transformer, because those nodes automatically return
    the message after processing the data. The combiner node, however,
    might not return the message because it needs a corresponding data
    message on its other input connection.


Build Instructions
=========================================================================

 1. Create a work folder on your file system.

    mkdir work

 2. This example uses the transport built in the ex45_host example. You
    must install and build the ex45_host example first.

    cd work
    unzip ex45_host.zip

    Follow the instructions in ex45_host/readme.txt to build the example.

 3. Extract this example into your work folder.

    cd work
    unzip ex46_graph.zip

 4. Set up the build environment. Edit products.mak and set the install paths
    to match your development environment. Each example has its own
    products.mak file; you may also create a products.mak file in the parent
    directory, which will be used by all examples.

    edit ex46_graph/products.mak

    TOOLCHAIN_LONGNAME = arm-linux-gnueabihf
    TOOLCHAIN_INSTALL_DIR = <...>/linaro-arm-gnueabihf_4_7_2013_03
    TOOLCHAIN_PREFIX = $(TOOLCHAIN_INSTALL_DIR)/bin/$(TOOLCHAIN_LONGNAME)-

    IPC_INSTALL_DIR = <...>/ipc_m_mm_pp_bb

    Your DESTDIR must point to your IPC "install" location, that is, the
    location you specified when you ran 'make install' within the IPC
    product.

    DESTDIR = <...>

    Note: To build this example, you must install IPC into DESTDIR.

 5. Build the example. This will build only debug versions of the executables.
    To build both debug and release executables, edit the lower-level
    makefiles and uncomment the release goals.

    cd ex46_graph
    make

 6. Issue the following commands to clean your example.

    cd ex46_graph
    make clean

 7. Copy the executables and supporting scripts to your target file system.

    ex46_graph/combiner/bin/debug/combinerN.xe66
    ex46_graph/consumer/bin/debug/consumer
    ex46_graph/manager/bin/debug/manager
    ex46_graph/producer/bin/debug/producer
    ex46_graph/scripts/patchExec.pl
    ex46_graph/scripts/run_lad.sh
    ex46_graph/scripts/run_patch_combiner.sh
    ex46_graph/scripts/run_patch_transformer.sh
    ex46_graph/scripts/vritio.awk
    ex46_graph/transformer/bin/debug/transformerN.xe66

 8. Copy the LAD executable from the IPC product to your target file system.

    <DESTDIR>/usr/bin/lad_tci6638


Patching The DSP Executables
=========================================================================
This example illustrates how to use one DSP executable and one LAD
executable for all your devices. To achieve this, the DSP executable
must be patched for each device and the LAD daemon must be started
with a command line option to specify the cluster baseId.

In our hypothetical system, we use one cluster for each device. Each
cluster has 9 members (to match the number of processors on the device).
The cluster baseId is the processor number of the first processor in
the cluster (zero-based).

For example, the clusters would be mapped out as follows:

    Device #  BaseId  Cluster Members
    =========================================================================
        1         0   HOST    (0), DSP1    (1), DSP2    (2), ..., DSP8    (8)
        2         9   HOST    (9), DSP1   (10), DSP2   (11), ..., DSP8   (17)
        3        18   HOST   (18), DSP1   (19), DSP2   (20), ..., DSP8   (26)
       :         :      :
      511      4590   HOST (4590), DSP1 (4591), DSP2 (4592), ..., DSP8 (4598)
      512      4599   HOST (4599), DSP1 (4600), DSP2 (4601), ..., DSP8 (4607)

To load and run on any given device, use the BaseId for that device when
you patch your DSP executable and when starting LAD.
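
The table above follows a simple rule: with 9 members per cluster, the
BaseId of device N is (N - 1) * 9. As a minimal sketch, the value can be
computed in the shell. The function name cluster_base_id is made up for
illustration and is not part of the example.

```shell
#!/bin/sh
# Hypothetical helper, not part of ex46_graph.
# Assumes 9 cluster members per device (HOST + DSP1..DSP8), as in the table.
MEMBERS_PER_CLUSTER=9

# cluster_base_id <device number, starting at 1>
# Prints the processor number of the first processor in the cluster.
cluster_base_id() {
    echo $(( ($1 - 1) * MEMBERS_PER_CLUSTER ))
}

# Example: BaseId for Device #3 (prints 18).
cluster_base_id 3
```

The printed value is the one to pass to the patch scripts and to LAD.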

 1. Patch the combiner executable with the cluster baseId. For example, if
    running on Device #3, the BaseId would be 18 (see table above). The
    perl script makes a copy of the DSP executable file.

    perl patchExec.pl 18 combinerN.xe66 combinerN_p.xe66

    You can also use the helper script provided with the example.

    run_patch_combiner.sh 18

    This will generate a new patched DSP executable.

    combinerN_p.xe66

 2. Patch the transformer executable as above.

    perl patchExec.pl 18 transformerN.xe66 transformerN_p.xe66

    You can also use the helper script provided with the example.

    run_patch_transformer.sh 18

    This will generate a new patched DSP executable.

    transformerN_p.xe66
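
Both patch steps use the same command form, so they can be combined. The
sketch below is only an illustration: the function name patch_all is
hypothetical, and it assumes that patchExec.pl and both unpatched
executables are in the current directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of ex46_graph.
# Assumes patchExec.pl, combinerN.xe66 and transformerN.xe66 are in the
# current directory.

# patch_all <baseId>
# Runs patchExec.pl on both DSP executables, stopping at the first failure.
patch_all() {
    for name in combinerN transformerN; do
        perl patchExec.pl "$1" "$name.xe66" "${name}_p.xe66" || return 1
    done
}
```

For Device #3, the call would be: patch_all 18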


Running The Example
=========================================================================
 1. Start LAD. You must start the LAD daemon before running any IPC
    program. If it is not already running, use the following command
    to start it.

    lad_tci6638 -s PAIR -n 4608 -b <baseId> -r 8 -l log.txt

    You can also use the helper script provided with the example.

    run_lad.sh <baseId>

 2. Run the manager program in an interactive shell on your target.
    The manager will accept keyboard input from the shell. Use the
    manager to interact with the rest of this example.
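
Step 1 starts LAD only if it is not already running. One possible way to
script that check is sketched below. The function name
start_lad_if_needed is hypothetical, and the sketch assumes that pgrep is
available on the target and that lad_tci6638 is on the PATH. It does not
verify that a LAD which is already running was started with the same
baseId.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of ex46_graph. Assumes pgrep is available
# on the target and that lad_tci6638 is on the PATH.

# start_lad_if_needed <baseId>
# Starts LAD with the options shown in step 1 unless it is already running.
start_lad_if_needed() {
    if pgrep -x lad_tci6638 >/dev/null 2>&1; then
        echo "LAD is already running"
    else
        lad_tci6638 -s PAIR -n 4608 -b "$1" -r 8 -l log.txt
    fi
}
```

For Device #3, the call would be: start_lad_if_needed 18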


Focus Points
=========================================================================

 *. The control thread must start IPC first. The user thread must wait
    until IPC has been started.

 *. The control thread must not stop IPC until the user thread has finished.

 *. The command loop cannot contain nested loops; use deferred commands
    instead.

 *. On the host, call Ipc_attach only once, not in a loop as you would
    on the DSP.

 *. After a new DSP is started, every running host IPC program must attach
    to the new DSP (if it wants to communicate with it).

    Before a DSP is shut down, all host IPC programs must detach from it.

 *. When a new host IPC program starts, it must attach to all running
    DSPs (if it wants to communicate with them).

    Before a host program shuts down, it must detach from all DSPs.

 *. Each DSP creates its own message pool for output data buffers. Use a
    SYS/BIOS heap because all messages are round-trip. Place the heap in
    shared memory, acquiring its memory from the shared region heap. When
    computing the heap size, account for alignment.
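
To illustrate the last point, the sketch below rounds each message up to
the alignment before multiplying by the message count. The message size,
message count, and alignment are made-up values, not taken from the
example sources. The real alignment depends on the device and on the heap
implementation, which may also add overhead of its own.

```shell
#!/bin/sh
# Hypothetical sizing helper, not part of ex46_graph.

# round_up <size> <alignment>
# Rounds size up to the next multiple of alignment.
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# heap_size <message size> <message count> <alignment>
# Prints the bytes needed when every message starts on an aligned boundary.
heap_size() {
    echo $(( $(round_up "$1" "$3") * $2 ))
}

# Example with made-up numbers: 8 messages of 200 bytes, 128-byte alignment.
# Each message occupies 256 bytes, so this prints 2048.
heap_size 200 8 128
```

Without the rounding step, the same pool would be sized at 1600 bytes,
which is too small once each buffer is aligned.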
