#
#  ======== readme.txt ========
#

Overview
=========================================================================
This example illustrates how to configure IPC for a large scale system.
It also provides a starting point for implementing a secondary transport.

We consider a hypothetical system which contains over 4000 processors.
This system contains 16 boards each with 32 devices for a total of 512
devices. Each device contains 9 processors: one HOST and eight DSPs. This
makes a total of 4,608 processors.
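
The arithmetic behind these figures can be checked with a small shell
sketch. The variable and function names are hypothetical and used only for
illustration; the counts come from the paragraph above.

```shell
#!/bin/sh
# Topology figures taken from the overview text.
BOARDS=16
DEVICES_PER_BOARD=32
PROCS_PER_DEVICE=9     # one HOST plus eight DSPs

# Total number of devices in the system.
total_devices() { echo $(( BOARDS * DEVICES_PER_BOARD )); }

# Total number of processors in the system.
total_processors() { echo $(( $(total_devices) * PROCS_PER_DEVICE )); }

echo "devices:    $(total_devices)"
echo "processors: $(total_processors)"
```

The processor total is the value given later to LAD as the total number
of processors in the system.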

To manage such a large processor pool, we define a cluster which partitions
the processor pool into small local groups. Each device will define a
single cluster. The primary transports will be used to communicate within
a cluster; secondary transports are used to communicate between clusters.
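
As a rough illustration of this split, the sketch below decides which
class of transport a message between two processors would use. It assumes
processors are numbered consecutively from zero with nine per cluster, as
in the table under "Running The Example". The helper names are
hypothetical; the real decision is made inside IPC, not by a script.

```shell
#!/bin/sh
PROCS_PER_CLUSTER=9   # one device = one cluster = 9 processors

# Zero-based cluster index that processor id $1 belongs to.
cluster_of() { echo $(( $1 / PROCS_PER_CLUSTER )); }

# Print the transport class for a message from processor $1 to processor $2.
transport_for() {
    if [ "$(cluster_of "$1")" -eq "$(cluster_of "$2")" ]; then
        echo primary
    else
        echo secondary
    fi
}

transport_for 0 8     # HOST and DSP8 of the first device: primary
transport_for 8 9     # DSP8 of the first device, HOST of the second: secondary
```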

This example also introduces reserved message queues, which require special
handling in configuration. They are efficient for large scale systems
because they do not require a run-time lookup: with reserved message
queues you can use direct messaging.
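
To see why no lookup is needed, consider the sketch below. It ASSUMES a
32-bit queue id that holds the owning processor id in the upper 16 bits
and the queue index in the lower 16 bits. That layout is an assumption
made for illustration, so verify it against the MessageQ header in your
IPC release. The function name is hypothetical.

```shell
#!/bin/sh
# ASSUMPTION: queue id = (processor id << 16) | queue index.
# Check the MessageQ header in your IPC release before relying on this.
queue_id() {
    proc_id=$1
    queue_index=$2
    echo $(( (proc_id << 16) | queue_index ))
}

# Reserved queue 0 on processor 19 (DSP1 of Device #3 in this example).
printf '0x%08x\n' "$(queue_id 19 0)"
```

Under that assumption, a sender that knows the processor id and the
reserved queue index can form the destination directly, with no name
lookup at run time.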

This example builds a single DSP executable which is loaded onto every
processor. Before loading the DSP, the executable is patched with the
cluster baseId for that device. The patched executable is then loaded
onto each of the DSP processors of that device. This would be repeated
for each device in the system.

A single LAD executable is used on all the devices. When starting LAD,
command line arguments are used to specify the number of reserved message
queues, the total number of processors in the system, and the cluster
baseId. Optionally, you can build LAD with the number of reserved queues
and total number of processors built-in, because these values are the same
on all devices. However, the cluster baseId should always be specified on
the command line. LAD is built as part of the IPC product.


Build Instructions
=========================================================================

 1. Create a work folder on your file system.

    mkdir work

 2. Extract this example into your work folder.

    cd work
    unzip ex44_compute.zip

 3. Set up the build environment. Edit products.mak and set the install paths
    as defined by your physical development area. Each example has its own
    products.mak file; you may also create a products.mak file in the parent
    directory which will be used by all examples.

    edit ex44_compute/products.mak

    TOOLCHAIN_INSTALL_DIR = <...>/_your_linux_gcc_toolchain_install_
    BIOS_INSTALL_DIR = <...>/bios_m_mm_pp_bb
    IPC_INSTALL_DIR = <...>/ipc_m_mm_pp_bb
    XDC_INSTALL_DIR = <...>/xdctools_m_mm_pp_bb
    ti.targets.elf.C66 = <...>/c6000_m_m_p

    Your DESTDIR must point to your IPC "install" location, that is, the
    location you specified when you ran 'make install' within the IPC
    product.

    DESTDIR = <...>

 4. Build the example. This will build only debug versions of the executables.
    Edit the lower makefiles and uncomment the release goals to build both
    debug and release executables.

    cd ex44_compute
    make

    Use the following commands to clean your example.

    cd ex44_compute
    make clean

 5. Copy the HOST executable, the DSP executable, and the supporting
    scripts to your target file system.

    ex44_compute/host/bin/debug/app_host
    ex44_compute/dsp/bin/debug/compute_dspN.xe66
    ex44_compute/scripts/patchExec.pl
    ex44_compute/scripts/run_dsp.sh
    ex44_compute/scripts/run_host.sh
    ex44_compute/scripts/run_lad.sh
    ex44_compute/scripts/run_patch.sh
    ex44_compute/scripts/stop_dsp.sh

 6. Optional. This example uses command line arguments to launch LAD.
    These instructions are here for reference.

    Configure and build LAD. This example builds for the TCI6638 device.
    Use the appropriate config file for your device.

    cd <IPC Install>/linux/src/daemon/cfg
    edit MultiProcCfg_tci6638.c

    Add the following header file:

        #include <ti/ipc/MultiProc.h>

    Make the following changes:

        .numProcessors = 4608,
        .id = MultiProc_INVALIDID,
        .baseIdOfCluster = MultiProc_INVALIDID

    Configure LAD to set aside reserved message queues. This example
    uses 8 reserved message queues. Edit the following file to reserve
    message queues in LAD.

    cd <IPC Install>/linux/src/daemon/cfg
    edit MessageQCfg.c

    Make the following changes:

        .numReservedEntries = 8

    Rebuild your LAD executable. Copy the LAD executable to your
    target file system.

    <IPC Install>/linux/src/daemon/lad_tci6638


Running The Example
=========================================================================
This example illustrates how to use one DSP executable and one LAD
executable for all your devices. To achieve this, the DSP executable
must be patched for each device and the LAD daemon must be started
with a command line option to specify the cluster baseId.

In our hypothetical system, we use one cluster for each device. Each
cluster has 9 members (to match the number of processors on the device).
The cluster baseId is the processor number of the first processor in
the cluster (zero-based).

For example, the clusters would be mapped out as follows:

    Device #  BaseId  Cluster Members
    =========================================================================
        1         0   HOST    (0), DSP1    (1), DSP2    (2), ..., DSP8    (8)
        2         9   HOST    (9), DSP1   (10), DSP2   (11), ..., DSP8   (17)
        3        18   HOST   (18), DSP1   (19), DSP2   (20), ..., DSP8   (26)
       :         :      :
      511      4590   HOST (4590), DSP1 (4591), DSP2 (4592), ..., DSP8 (4598)
      512      4599   HOST (4599), DSP1 (4600), DSP2 (4601), ..., DSP8 (4607)

To load and run on any given device, use the BaseId for that device when
you patch your DSP executable and when starting LAD.
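
The table follows a simple rule: the baseId is the device number minus
one, multiplied by the nine processors in each cluster. A minimal sketch
of that rule follows. The function names are hypothetical and are not
part of the example scripts.

```shell
#!/bin/sh
PROCS_PER_CLUSTER=9   # one HOST plus eight DSPs

# Print the cluster baseId for 1-based device number $1.
cluster_base_id() { echo $(( ($1 - 1) * PROCS_PER_CLUSTER )); }

# Print one "name id" line for each cluster member of device $1.
cluster_members() {
    base=$(cluster_base_id "$1")
    echo "HOST $base"
    i=1
    while [ "$i" -lt "$PROCS_PER_CLUSTER" ]; do
        echo "DSP$i $(( base + i ))"
        i=$(( i + 1 ))
    done
}

cluster_base_id 3      # prints 18
cluster_members 512    # prints HOST 4599 through DSP8 4607
```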

 1. Patch the DSP executable with the cluster baseId. For example, if
    running on Device #3, the BaseId would be 18 (see table above). The
    perl script makes a copy of the DSP executable file.

    perl patchExec.pl 18 compute_dspN.xe66 compute_dspN_patched.xe66

    You can also use the helper script provided with the example.

    run_patch.sh 18

    This will generate a new patched DSP executable.

    compute_dspN_patched.xe66

 2. Start LAD. You must start the LAD daemon before running any IPC
    program. Use the same BaseId that you used above when patching
    the DSP executable.

    lad_tci6638 -r 8 -n 4608 -b 18 -l log.txt

    You can also use the helper script provided with the example.

    run_lad.sh 18

    Note: If you rebuilt LAD with the specified number of reserved
    message queues and total processors in the system, you should omit
    the '-r' and '-n' options above.

 3. Load and run the DSP processors. You must load the same patched DSP
    executable onto each of the DSP processors of your device. Use the
    loader provided with the MCSDK (mpmcl).

    mpmcl load dsp0 compute_dspN_patched.xe66
    mpmcl load dsp1 compute_dspN_patched.xe66
    ...

    Finally, run all the DSP processors.

    mpmcl run dsp0
    mpmcl run dsp1
    ...

    You can also use the helper script provided with the example.

    run_dsp.sh

 4. Run the HOST application program. The application program runs on
    the host processor. It will exchange a few messages with each DSP
    and then shut down.

    app_host

 5. Use CCS to inspect the DSP logs. Load the DSP symbols into CCS and
    then attach to each of the DSP processors. Open the Real-Time Object
    Viewer (ROV) and select the LoggerBuf module. Inspect the log events.

 6. If you wish to run the example again, you must first stop and reload
    the DSP processors. Use the helper scripts to make this easier. Note:
    you do not need to restart the LAD daemon.

    stop_dsp.sh
    run_dsp.sh
    run_host.sh
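
The steps above can be summarized as a dry run. The sketch below only
prints the commands for a given device; it does not execute them. It is
not one of the helper scripts shipped with the example, and the function
name is hypothetical. The core names are passed in by the caller, so
supply the names your loader expects.

```shell
#!/bin/sh
PROCS_PER_CLUSTER=9
NUM_RESERVED=8        # reserved message queues used by this example
NUM_PROCS=4608        # total processors in the system

# Print (do not execute) the bring-up commands for 1-based device $1.
# The remaining arguments are the mpmcl core names to load and run.
print_device_commands() {
    base=$(( ($1 - 1) * PROCS_PER_CLUSTER ))
    shift
    echo "perl patchExec.pl $base compute_dspN.xe66 compute_dspN_patched.xe66"
    echo "lad_tci6638 -r $NUM_RESERVED -n $NUM_PROCS -b $base -l log.txt"
    for core in "$@"; do
        echo "mpmcl load $core compute_dspN_patched.xe66"
    done
    for core in "$@"; do
        echo "mpmcl run $core"
    done
    echo "app_host"
}

# Device #3, showing only the first two cores.
print_device_commands 3 dsp0 dsp1
```

Printing the commands first makes it easy to confirm that the same baseId
is used for both the patch step and the LAD command line.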
