===============================
 GRAPE Software Package
 grapepkg version 1.2.0
===============================

GRAPE Software Package (grapepkg) is a collection of user libraries,
utilities, documents, and sample programs for GRAPEs. It supports not
only KFCR's GRAPE-DR and GRAPE-7, but also GRAPE-6A, GRAPE-6BX, and
Phantom-GRAPE-5. An optional package, CUDA G5/G6, also provides G5 and
G6 libraries for NVIDIA's CUDA devices (i.e. GPUs).

The package includes:

    ./00readme          --  brief instructions for the package.
    ./00readme-j        --  00readme in Japanese.
    ./doc/              --  user's guide, reference manual, and other documents.
    ./script/           --  install & backup scripts.
    ./include/          --  header files.
    ./lib/              --  libraries.
    ./driver/           --  a device driver for GRAPE-DR and GRAPE-7.
    ./hibutil/          --  the Host Interface Bridge (HIB) for GRAPE-DR and GRAPE-7.
    ./gdr/              --  software for GRAPE-DR.
    ./g7/               --  software for GRAPE-7.
    ./g6a/              --  software for GRAPE-6A.
    ./g6bx/             --  software for GRAPE-6BX.
    ./pg5/              --  software for Phantom-GRAPE-5.
    ./cuda/             --  software for CUDA devices. [included in the optional package CUDA G5/G6]
    ./sample/           --  sample programs.
    ./sample/direct/    --  a sample program (direct-summation algorithm, equal-timestep, written in C).
    ./sample/directf/   --  a sample program (direct-summation algorithm, equal-timestep, written in Fortran).
    ./sample/vtc/       --  a sample program (Barnes-Hut tree algorithm, equal-timestep, written in C).
    ./sample/s9/        --  a sample program (direct-summation algorithm, equal-timestep, written in C).
    ./sample/s8/        --  a sample program (direct-summation algorithm, individual-timestep, written in C).
    ./sample/s8f/       --  a sample program (direct-summation algorithm, individual-timestep, written in Fortran).
    ./sample/pairwise/  --  a sample program to check the accuracy of the pairwise force, written in C.
    ./init/             --  snapshots of particle distributions, used for functionality tests of the hardware.
    ./tmp/              --  used by test programs and other utilities.
    ./ttf/              --  bitstreams (.ttf files) for GRAPE-7 model300 and model600.
                            This directory is optional; your package may not include it.
                            You can download it from our Web site (http://www.kfcr.jp/grape7.html).

Contents
========
1. Installation and Preparation
  1.1 Installation of the Package
  1.2 Preparation of GRAPE-DR at Boot Time
  1.3 Preparation of GRAPE-7 at Boot Time
  1.4 Preparation of GRAPE-6A at Boot Time
  1.5 Preparation of GRAPE-6BX at Boot Time
  1.6 Preparation of Phantom-GRAPE-5 at Boot Time
  1.7 Preparation of CUDA G5/G6 at Boot Time
2. Compilation and Linkage
3. Environment Variables
  3.1 GDEVICE : Assignment of Calculation Resources
  3.2 GWARNLEVEL : Warning Message Control
4. Sample Programs
5. Compatibility
6. Additional API for CUDA G5
  6.1 Support for Multiple Walks Method
7. Additional API for CUDA G6
  7.1 Optimization for J-Particle Transfer
8. Tested Platforms
9. References
10. License and Copyright
11. Acknowledgement
12. Modification History
13. Contact

1. Installation and Preparation
===============================

1.1 Installation of the Package
-------------------------------

In order to install the package, type

    $pkgroot/script/install

and follow its instructions.
Here, $pkgroot denotes the topmost directory of the package.


1.2 Preparation of GRAPE-DR at Boot Time
----------------------------------------

You need to follow the procedure (1)--(4) every time you restart the
host computer.

(1) [Root Permission Required] Load the Device Driver 

    Change directory to $pkgroot/driver/, and then type

        make installmodule

    This will plug the HIB driver for GRAPE-DR into the Linux
    kernel.  If the driver is successfully loaded, you should see the
    word "hibdrv" in the output of /sbin/lsmod.


(2) [Root Permission Required] Set up MTRR 

    Change directory to $pkgroot/driver/, and then type

        ./setmtrr

    This will set the MTRR (memory type range register) of the host
    computer to "write-combining" mode, which improves the speed of
    Programmed I/O Write (PIOW) data transfer. This setting affects
    the performance of sending data from the host computer to GRAPE-DR.


    Note 1:
    In some cases the MTRR cannot be set to "write-combining" mode (for
    example, when all 8 existing MTRRs are already assigned to other PCI
    devices or main memory regions, or when the total size of the main
    memory exceeds 4GB and the chipset of the motherboard does not
    support I/O mapping to main memory addresses higher than
    4GB). In such cases, you can continue with the rest of the procedure
    described below without setting up the MTRR.  All functions of
    GRAPE-DR should work without problems, except that the speed of data
    transfer from the host computer to GRAPE-DR would be reduced by
    50% or more.

    Note 2:
    MTRR setup is not necessary if the Linux kernel version is 2.6.26
    or higher and PAT (page attribute table) support is enabled. To
    find out whether your kernel supports PAT, check the Linux header
    files (e.g. /usr/src/linux/include/linux/autoconf.h) to see if
    CONFIG_X86_PAT is defined.

(3) Initialize GRAPE-DR

    The following command initializes GRAPE-DR, and performs some
    basic tests.

        $pkgroot/script/config

    [For Users of model450 and model1800]
    For the initialization of model450 and model1800, use

        $pkgroot/gdr/test/config450
    and
        $pkgroot/gdr/test/config1800

    instead of $pkgroot/script/config.

(4) Check the Functionality of GRAPE-DR 

    The following command performs some many-body simulations and
    compares the results with precalculated ones.

        $pkgroot/script/check

    [For Users of model450 and model1800]
    For the functionality check of model450 and model1800, use

        $pkgroot/gdr/test/check450
    and
        $pkgroot/gdr/test/check1800

    instead of $pkgroot/script/check.
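
The checks mentioned in steps (1) and (2) can also be scripted. The
following is only a minimal sketch and is not part of the package: the
function names module_listed and pat_enabled are hypothetical, and the
location of the kernel configuration file (/boot/config-<version>) is an
assumption that depends on your distribution.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with grapepkg).

# Succeed if kernel module $1 appears in lsmod-style text read from stdin.
# The first line of lsmod output is a column header, so it is skipped.
module_listed() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Succeed if the kernel configuration text on stdin has PAT support enabled.
# Accepts both the autoconf.h form and the /boot/config-* form.
pat_enabled() {
    grep -Eq '^(#define CONFIG_X86_PAT 1|CONFIG_X86_PAT=y)'
}

# Usage: report the state of the running system.
if /sbin/lsmod 2>/dev/null | module_listed hibdrv; then
    echo "hibdrv: loaded"
else
    echo "hibdrv: not loaded (run 'make installmodule' in the driver directory)"
fi

# Assumption: the configuration of the running kernel is stored here.
cfg=/boot/config-$(uname -r)
if [ -r "$cfg" ] && pat_enabled < "$cfg"; then
    echo "PAT: enabled (MTRR setup may be unnecessary on kernel 2.6.26 or later)"
else
    echo "PAT: not confirmed; run ./setmtrr as described in step (2)"
fi
```

Both helpers read text from standard input, so they can be tried on saved
lsmod output or on a copy of a kernel configuration file as well.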


1.3 Preparation of GRAPE-7 at Boot Time
---------------------------------------

You need to follow the procedure (1)--(3) every time you restart the
host computer.

(1) Follow the procedure 1.2-(1) and 1.2-(2)
    in section "Preparation of GRAPE-DR at Boot Time" to load the
    device driver, and set up MTRR. 

(2) [For model300 and model600 Only] Configure the FPGAs

    Configure FPGAs with G5PIPE backend logic (pipelines for
    gravitational force calculation). This procedure is not necessary
    for model 100 and model 800.


    Change directory to $pkgroot/g7/config/, and then type

        ./config [devid]

    This will reconfigure the FPGA(s) on the devid-th
    card. If the argument devid is not given, device ID 0
    is assumed.  If you have only one GRAPE-7 card in the system, its
    device ID should always be 0, and thus you can omit the
    argument. If you have multiple cards, you need to confirm the
    device ID of the card to be configured.  Use the command
    $pkgroot/script/lsgrape to identify the device ID.


(3) Follow the procedure 1.2-(4)
    in section "Preparation of GRAPE-DR at Boot Time" to check the
    functionality of the card(s). 

More descriptions for installation and usage of GRAPE-7 can be found
in "GRAPE-7 Installation Guide" ($pkgroot/doc/g7install.pdf).  For
usage of G5PIPE, refer to "G5PIPE User's Guide"
($pkgroot/doc/g5user.pdf).
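
On a host with several model300 or model600 cards, step (2) has to be
repeated for each device ID. A minimal sketch of such a loop is shown
below. The function name configure_cards is hypothetical, the device IDs
0 and 1 are only examples (read the real ones from the output of
$pkgroot/script/lsgrape), and the script is meant to be run from
$pkgroot/g7/config/.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grapepkg).
# Runs "<runner> ./config <devid>" for each device ID given.
# Pass "echo" as the runner for a dry run, or "env" to really execute.
configure_cards() {
    runner=$1
    shift
    for devid in "$@"; do
        "$runner" ./config "$devid" || return 1
    done
}

# Dry run for two cards with the example device IDs 0 and 1.
configure_cards echo 0 1
```

The dry run only prints the commands. Replacing "echo" with "env" executes
them, and the loop stops at the first card whose configuration fails.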


1.4 Preparation of GRAPE-6A at Boot Time
----------------------------------------

(1) [Root Permission Required] Load the Device Driver

    Change directory to $pkgroot/g6a/pcimem, and then type

        make installmodule

    This will plug the driver for GRAPE-6A into the Linux
    kernel.  If the driver is successfully loaded, you should see the
    word "pcimem" in the output of /sbin/lsmod.

(2) Initialize GRAPE-6A

    Change directory to $pkgroot/g6a/lib, and then type

        g6aconfig

    to initialize the card.

(3) Check the Functionality of GRAPE-6A
    
    Change directory to $pkgroot/g6a/s8, and then type

        make s8
        s8

    After the completion of the command, compare the result with
    $pkgroot/g6a/s8/sample.1k.


1.5 Preparation of GRAPE-6BX at Boot Time
-----------------------------------------

(1) [Root Permission Required] Load the Device Driver

    Change directory to $pkgroot/g6bx/pcixmem, and then type

        make installmodule

    This will plug the driver for GRAPE-6BX into the Linux
    kernel.  If the driver is successfully loaded, you should see the
    word "pcixmem" in the output of /sbin/lsmod.


(2) Check the Functionality of GRAPE-6BX
    
    Change directory to $pkgroot/g6bx/s8, and then type

        make s8
        s8

    After the completion of the command, compare the result with
    $pkgroot/g6bx/s8/sample.1k.


1.6 Preparation of Phantom-GRAPE-5 at Boot Time
-----------------------------------------------

    No initialization procedure is necessary for Phantom-GRAPE-5 at boot time.

1.7 Preparation of CUDA G5/G6 at Boot Time
------------------------------------------

    Note : By default, CUDA devices are not supported by the GRAPE
           Software Package. You need an optional package CUDA G5/G6
           in order to use them.

(1) Set up CUDA Environment and Install the Package

    Before you run the installation script $pkgroot/script/install,
    you need to set up the CUDA development environment, including the
    device driver, the Toolkit, and the SDK, provided by NVIDIA.

      Quick Guide for CUDA Environment Installation:

        1) Download the following packages from NVIDIA's web site.
          a) Developer Drivers for Linux
          b) CUDA Toolkit
          c) GPU Computing SDK code samples

        2) Run packages a) and b), and follow their instructions to
           install the driver and the toolkit. Note that root
           permission is required for this procedure.

        3) Set the environment variable PATH to include $cudapath/bin.
           Also set LD_LIBRARY_PATH to include $cudapath/lib64.
           Here, $cudapath is the path where you installed package b).

           example:

            csh> setenv PATH /usr/local/cuda/bin:$PATH
            csh> setenv LD_LIBRARY_PATH /usr/local/cuda/lib64:$LD_LIBRARY_PATH

             sh> export PATH=/usr/local/cuda/bin:$PATH
             sh> export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

        4) Run package c), and follow its instructions to install the SDK
           (root permission is not necessary).

      For the complete procedure for installing the CUDA environment,
      please refer to NVIDIA's documents.

(2) Check the Functionality of CUDA G5/G6

    After the installation is completed, you can run the following
    command to perform some many-body simulations and compare the
    results with precalculated ones.

        $pkgroot/script/check_cuda

(3) Configuration at Boot Time

    Once installation is completed, no initialization procedure is
    necessary at each boot time.
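
Before running $pkgroot/script/install, it may help to confirm that the
settings of step 3) are in effect. The sketch below is not part of the
package: the function name path_contains is hypothetical, and
/usr/local/cuda is assumed as the default value of $cudapath.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grapepkg or CUDA).
# Succeed if directory $2 is an element of the colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the toolkit was installed under /usr/local/cuda.
cudapath=${cudapath:-/usr/local/cuda}

path_contains "${PATH:-}" "$cudapath/bin" \
    || echo "PATH does not include $cudapath/bin"
path_contains "${LD_LIBRARY_PATH:-}" "$cudapath/lib64" \
    || echo "LD_LIBRARY_PATH does not include $cudapath/lib64"
command -v nvcc >/dev/null 2>&1 \
    || echo "nvcc (the CUDA compiler driver) was not found in PATH"
```

The script prints nothing when all three conditions hold. Set cudapath in
the environment first if the toolkit is installed elsewhere.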


2. Compilation and Linkage
==========================

In order to use GRAPE hardware from your own application
programs, include the header file of the control library in your
program, and then link the library.

The control library provides two different APIs, namely, GRAPE-5
compatible G5 API, and GRAPE-6 compatible G6 API (G6 API does not
support GRAPE-7 and Phantom-GRAPE-5).

In addition to G5 API, G5nb API is provided for GRAPE-7. G5nb API is
an extension of G5 API, which has additional functions for neighbor
particle detection. For the details of G5 and G5nb APIs, see "G5PIPE
User's Guide" ($pkgroot/doc/g5user.pdf) and "G5nbPIPE User's Guide"
($pkgroot/doc/g5nbuser.pdf), respectively.

The table below shows required header files, libraries and compile
options for each GRAPE hardware:

-----------------------------------------------------------------------------------------------------------
GRAPE              API   header     library                                                 compile option
-----------------------------------------------------------------------------------------------------------
GRAPE-DR           G6    g6util.h   libgdr6.a          libhib.a            libm.a
                   G5    g5util.h   libgdr5.a          libhib.a            libm.a

GRAPE-7            G5    g5util.h   libg75.a           libhib.a            libm.a
                   G5nb  g5nbutil.h libg75nb.a         libhib.a            libm.a

GRAPE-6A           G6    g6util.h   libg6a6.a                              libm.a
                   G5    g5util.h   libg6a5.a                              libm.a

GRAPE-6BX          G6    g6util.h   libg6bx6.a         libg6bxhib.a        libm.a
                   G5    g5util.h   libg6bx5.a         libg6bxhib.a        libm.a

Phantom-GRAPE-5    G5    g5util.h   libpg55.a                              libm.a

CUDA G5/G6         G5    g5util.h   libcuda5.a         libcudart.so(Note1) libm.a  libstdc++    -pthread
                                    libcuda5s.a(Note2)
                   G6    g6util.h   libcuda6.a         libcudart.so        libm.a  libstdc++    -pthread
-----------------------------------------------------------------------------------------------------------
Note1 : libcudart.so should be found at $cudapath/lib (or $cudapath/lib64 on
        x86_64 systems; see section 1.7), where $cudapath is the path
        you installed NVIDIA's CUDA Toolkit into.

Note2 : libcuda5s.a adopts single precision (32-bit floating-point) format
        as numerical expression. libcuda5.a adopts pseudo-double precision
        (64-bit addition, 32-bit multiplication).

Examples of Option Switches Passed to the Compiler

    An application program foo.c that uses GRAPE-DR via the G6 API:
    cc -o foo foo.c -L$pkgroot/lib -I$pkgroot/include -lgdr6 -lhib -lm

    An application program foo.c that uses GRAPE-7 via the G5 API:
    cc -o foo foo.c -L$pkgroot/lib -I$pkgroot/include -lg75 -lhib -lm

    An application program foo.c that uses NVIDIA's CUDA device via the G5 API:
    cc -pthread -o foo foo.c -L$pkgroot/lib -L$cudapath/lib -I$pkgroot/include -lcuda5 -lcudart -lm -lstdc++
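
The table above can be turned into a small lookup for build scripts. The
following is a minimal sketch, not part of the package: the function name
grape_ldflags and the short hardware names (gdr, g7, g6a, g6bx, pg5, cuda,
taken from the directory names of the package) are assumptions made for
this example. For CUDA, -pthread and -L$cudapath/lib still have to be
added by hand.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grapepkg).
# Print the linker options for hardware $1 and API $2, following the table
# in this section. Returns 1 for combinations the table does not list.
grape_ldflags() {
    case "$1/$2" in
        gdr/G6)   echo "-lgdr6 -lhib -lm" ;;
        gdr/G5)   echo "-lgdr5 -lhib -lm" ;;
        g7/G5)    echo "-lg75 -lhib -lm" ;;
        g7/G5nb)  echo "-lg75nb -lhib -lm" ;;
        g6a/G6)   echo "-lg6a6 -lm" ;;
        g6a/G5)   echo "-lg6a5 -lm" ;;
        g6bx/G6)  echo "-lg6bx6 -lg6bxhib -lm" ;;
        g6bx/G5)  echo "-lg6bx5 -lg6bxhib -lm" ;;
        pg5/G5)   echo "-lpg55 -lm" ;;
        cuda/G5)  echo "-lcuda5 -lcudart -lm -lstdc++" ;;
        cuda/G6)  echo "-lcuda6 -lcudart -lm -lstdc++" ;;
        *)        return 1 ;;
    esac
}

# Usage: print (rather than run) the command line for foo.c on GRAPE-DR
# with the G6 API. The default value of pkgroot is only a placeholder.
pkgroot=${pkgroot:-/path/to/grapepkg}
echo cc -o foo foo.c -L"$pkgroot/lib" -I"$pkgroot/include" $(grape_ldflags gdr G6)
```

A combination that the table does not list, such as GRAPE-7 with the G6
API, makes the function return a non-zero status without printing anything.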


3. Environment Variables
========================

For GRAPE-DR, GRAPE-7, Phantom-GRAPE-5 and CUDA G5/G6, the behavior of
the libraries can be controlled by the following environment
variables. These variables are not valid for GRAPE-6A or GRAPE-6BX.

3.1 GDEVICE : Assignment of Calculation Resources
-------------------------------------------------

If you have a system with multiple cards installed, by default, the
GRAPE control library functions use all of them. In order to modify
this behavior, you can set a list of device IDs to an environment
variable GDEVICE. If the list is set, GRAPE control library functions
use the listed cards only. For example,

    csh> setenv GDEVICE "0 2 3"

     sh> export GDEVICE="0 2 3"

would indicate that the cards with device IDs 0, 2, and 3 should be used.
This environment variable is useful when you share your system
with someone else.

In the case of GRAPE-DR model1800/2000/4000, GRAPE-7 model 800 and some
CUDA devices (e.g. GeForce GTX 295), multiple LSI chips on a single
card have device IDs different from each other, and these chips can
be assigned to different simulations. For example, on a system with
one GRAPE-DR model 1800 installed, you can set

    csh> setenv GDEVICE "0 2"

     sh> export GDEVICE="0 2"

in order to run a simulation on two GRAPE-DR chips with device ID 0
and 2.  You can run another simulation simultaneously, using chips
with device ID 1 and 3.
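
In sh, GDEVICE can also be set for a single command, which makes it easy
to start two simulations on disjoint sets of chips from one script. The
following is a minimal sketch: the function names run_on_devices and
devices_disjoint are hypothetical, and the "sh -c ..." commands merely
stand in for your own simulation programs.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with grapepkg).

# Run a command with GDEVICE set to the device-ID list $1,
# for that command only.
run_on_devices() {
    devices=$1
    shift
    env GDEVICE="$devices" "$@"
}

# Succeed if the two space-separated device-ID lists share no ID.
devices_disjoint() {
    for a in $1; do
        for b in $2; do
            [ "$a" = "$b" ] && return 1
        done
    done
    return 0
}

# Usage: two jobs on disjoint chips of one card, started in parallel.
if devices_disjoint "0 2" "1 3"; then
    run_on_devices "0 2" sh -c 'echo "job A uses GDEVICE=$GDEVICE"' &
    run_on_devices "1 3" sh -c 'echo "job B uses GDEVICE=$GDEVICE"' &
    wait
fi
```

Because the variable is set per command, the GDEVICE value of the calling
shell is left unchanged.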


3.2 GWARNLEVEL : Warning Message Control
----------------------------------------

Controls the warning message output.  The variable can be set to 0, 1,
2, or 3. The larger number indicates the more verbose outputs. The
number 1 or 2 is recommended for normal operation.  The number 3 is
useful for debugging. The default value is 2. The number 0
suppresses all but fatal error messages.  The variable should not be
set to 0 when you run the functionality check script
($pkgroot/script/check). Otherwise it would fail.
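
A wrapper script can guard against starting the functionality check with
GWARNLEVEL set to 0. The following is a minimal sketch: the function name
warnlevel is hypothetical, and treating values other than 0 to 3 as
invalid is an assumption of this example, not documented behavior of the
library.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grapepkg).
# Print the effective warning level for value $1: the documented default 2
# when $1 is empty, the value itself when it is 0, 1, 2 or 3.
# Returns 1 for any other value.
warnlevel() {
    case "${1:-2}" in
        0|1|2|3) echo "${1:-2}" ;;
        *) return 1 ;;
    esac
}

# Usage: warn before the functionality check is run with GWARNLEVEL=0.
level=$(warnlevel "${GWARNLEVEL:-}") || { echo "invalid GWARNLEVEL"; level=2; }
if [ "$level" -eq 0 ]; then
    echo "GWARNLEVEL=0 would make the check script fail; use 1, 2, or 3"
fi
```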


4. Sample Programs
==================

You can find sample programs in the $pkgroot/sample/ directory.  These
programs can use different GRAPE hardware by linking different
libraries. Note, however, that not all programs support all
hardware.

For example, a sample program for many-body simulation in the
direct-summation algorithm is stored in $pkgroot/sample/direct/
directory. When you run the installation script, executables for
various types of GRAPEs are generated: direct_gdr, direct_g7,
direct_g6bx, direct_pg5 and direct_cuda (which are for GRAPE-DR,
GRAPE-7, GRAPE-6BX, Phantom-GRAPE-5 and CUDA G5, respectively). No
executable is generated for GRAPE-6A.

In order to build sample programs by yourself, you can:

  - run a script '00recompile' located in a directory for each sample program.

  or,

  - use 'make' command. By default, a Makefile is initially set up for
    GRAPE-DR. For other architectures, you need to edit it by hand. A
    brief instruction can be found at the top of the Makefile.

5. Compatibility
================

As a general rule, G5 API and G6 API provide functions compatible with
GRAPE-5 and GRAPE-6, respectively. For some GRAPE models, however, the
APIs are not fully compatible. Some functions are not supported, some
are restricted. Such functions are summarized below:

G5 API Compatibility
-----------------------------------------------------------------------------------------------------------------------
GRAPE Models        Precision    Potential    Cutoff                 Neighbor Particle    Number of    Size of
                    Equivalent   Calculation  Calculation            List Creation        Pipelines    Particle Memory
                    to GRAPE-5   Function     Function               Function             per Device   per Device
-----------------------------------------------------------------------------------------------------------------------
Original GRAPE-5    -            Yes          Variable               Yes                    96          131071
GRAPE-DR            Yes          Yes          Fixed for P3M Method   No                    256         4194304
GRAPE-7             Yes          No           Fixed for P3M Method   Yes                  20-120       4095-24570
GRAPE-6A            Yes          Yes          Fixed for P3M Method   Yes                    48           65536
GRAPE-6BX           Yes          Yes          Fixed for P3M Method   Yes                    48          262144
Phantom-GRAPE-5     No           No           No                     No                      4           65536
CUDA G5             Yes          Yes          Fixed for P3M Method   No                   8192         1048576
-----------------------------------------------------------------------------------------------------------------------

G6 API Compatibility
------------------------------------------------------------------------------------------
GRAPE Models        Neighbor Particle    Nearest-Neighbor   Number of    Size of
                    List Creation        Particle Search    Pipelines    Particle Memory
                    Function             Function           per Device   per Device
------------------------------------------------------------------------------------------
GRAPE-DR            No                   Yes                 256         1048576
GRAPE-6A            Yes                  Yes                  48           65536
GRAPE-6BX           Yes                  Yes                  48          262144
CUDA G6             No                   Yes                8192         1048576
------------------------------------------------------------------------------------------

6. Additional API for CUDA G5
=============================

6.1 Support for Multiple Walks Method
-------------------------------------

CUDA G5 supports the Multiple Walks Method, an algorithm to improve
the performance of the Barnes-Hut Treecode on GPUs.

In the conventional (but modified for GRAPE) Treecode, forces from one
group of j-particles to one group of i-particles are calculated by the
GRAPE, and then the results are sent back to the host computer. This
procedure is repeated until all forces for all i-particles are obtained.

On the other hand, in the case of the Treecode that adopts the Multiple
Walks Method, multiple combinations of j-particle groups and
i-particle groups are posted to the GPU and handled simultaneously. By
doing this, we can take full advantage of arithmetic units on the GPU.
Also, the efficiency of data transfer between the host computer and
the GPU is improved (see [1] for details).

The Multiple Walks Method [1] is an algorithm named and integrated
into the Barnes-Hut Treecode on GPUs by Dr. Hamada (Nagasaki
University) and Dr. Nitadori (RIKEN). A many-body simulation performed
using the algorithm won the Gordon Bell Prize in 2009.

CUDA G5 provides new API functions, g5_flush_runs() and
g5_flush_runsMC(), to support the Multiple Walks Method. The following
shows a typical procedure to perform force calculation using these
functions:

(1) Set the environment variable G5_MULTIWALK to 1.

(2) Perform force calculation loop Nwalk times, (i.e., set
    j-particles, set i-particles, start the run, and get the results),
    using the conventional G5 API. Here, Nwalk is the number of pairs
    of i-particle groups and j-particle groups, which are posted to
    the GPUs at once.

(3) Call g5_flush_runs() (or g5_flush_runsMC()).

  Example:

    // (a) force calculation for nwalk pairs of i-particle groups and j-particle groups.
    for (w = 0; w < nwalk; w++) {
        g5_set_jp(0, n[w], mj[w], xj[w]);
        g5_set_eps2_to_all(eps*eps);
        g5_set_n(n[w]);
        for (off = 0; off < n[w]; off += npipe) {
            if (off + npipe > n[w]) {
                ni = n[w] - off;
            }
            else {
                ni = npipe;
            }
            g5_set_xi(ni, (double (*)[3])xj[w] + off);
            g5_run();
            g5_get_force(ni, (double (*)[3])(a[w] + off), p[w] + off);
        }
    }

    // (b) nwalk pairs are posted to the GPU.
    g5_flush_runs();

    (a) When the environment variable G5_MULTIWALK is set to 1,
        g5_set_jp(), g5_set_xi() and g5_get_force() do not perform
        actual data transfer. They push transfer requests to an
        execution queue prepared in the CUDA G5 library. Similarly,
        g5_run() does not perform force calculation. It pushes a
        calculation request to the queue.

    (b) At the point g5_flush_runs() is invoked, transfer and
        calculation requests stored in the queue are processed one by
        one, and then the results are retrieved from the GPU.

A sample program $pkgroot/sample/direct/multiwalktest.c shows the
usage of the new API.  It performs multiple many-body simulations
simultaneously, using direct-summation algorithm adopting the Multiple
Walks Method.

  Example:

    Run the program as follows:

        multiwalktest pl1k pl2k pl4k 

    This will perform three different simulations for particle
    distributions pl1k, pl2k and pl4k.
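
Step (1) of the procedure above requires G5_MULTIWALK to be set before the
program starts. The following minimal sketch sets it for a single command
only. The function name run_multiwalk is hypothetical, and it is an
assumption that multiwalktest expects the caller to set the variable
rather than setting it by itself.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grapepkg).
# Run a command with the Multiple Walks Method enabled for that command only.
run_multiwalk() {
    env G5_MULTIWALK=1 "$@"
}

# Usage: "sh -c ..." stands in for a real program such as
#     run_multiwalk ./multiwalktest pl1k pl2k pl4k
run_multiwalk sh -c 'echo "G5_MULTIWALK=$G5_MULTIWALK"'
```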

A sample program using the Barnes-Hut Treecode adopting the Multiple
Walks Method is in preparation.

7. Additional API for CUDA G6
=============================

7.1 Optimization for J-Particle Transfer
---------------------------------------

CUDA G6 provides a mechanism to improve the performance of
the individual-timestep code on GPUs.

In the conventional individual-timestep code on a GRAPE, j-particles are
sent one by one from the host computer to non-contiguous addresses of
the particle memory on the GRAPE.

This method is not optimal for GPUs. By following the procedure
below, you can send all j-particles to a contiguous region of the
particle memory in a single transfer.  This reduces the overhead
of the transfer and improves the performance.

(1) Set the environment variable G6_JPSORTED to 1.

(2) Send all j-particles to contiguous addresses starting from the top
    of the particle memory, in ascending order of their time,
    i.e., invoke g6_set_j_particle() Nupdate times, starting from the
    oldest particle at address 0 and incrementing the address one by
    one until it reaches (Nupdate-1). Here, Nupdate is the number of
    j-particles whose positions were updated in the previous timestep.

In order to send j-particles following the procedure (2), particles on
the host computer must be sorted in the ascending order of their time.
You can find how to do this in a sample program
$pkgroot/sample/s8/sticky8.c.  Its performance should be improved when
you set the environment variable G6_JPSORTED to 1.
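
The ordering required by procedure (2) can be illustrated outside the
simulation code. The sketch below is not how sticky8.c does it. It only
shows, on a made-up text format of one "id time" pair per line, which
address each updated particle would receive. The function name
assign_j_addresses and the input format are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grapepkg).
# Reads "id time" pairs, one per line, for the particles updated in the
# previous timestep, and prints "address id time" in ascending order of
# time, with addresses running from 0 to Nupdate-1.
assign_j_addresses() {
    LC_ALL=C sort -k2,2n | awk '{ print NR - 1, $1, $2 }'
}

# Usage with three made-up particles:
printf '%s\n' '17 0.5' '4 0.125' '9 0.25' | assign_j_addresses
```

For this input, particle 4 (the oldest, time 0.125) is assigned address 0,
particle 9 address 1, and particle 17 address 2.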

Note : If the environment variable G6_JPSORTED is set to 1 and
       j-particles are not sent according to procedure (2), the
       calculation result will be incorrect.


8. Tested Platforms
===================

    Fedora Core 5,10,11,13 x86_64
    CentOS 5.4 x86_64

9. References
=============

[1] T. Hamada, R. Yokota, K. Nitadori, T. Narumi, K. Yasuoka, M. Taiji
    "42 TFlops Hierarchical N-body Simulations on GPUs with
    Applications in both Astrophysics and Turbulence", SC09 (ACM/IEEE) 2009.


10. License and Copyright
=========================

The MIT software license (see below) is applied to the GRAPE Software
Package (hereafter "the Software"), unless otherwise mentioned.

The exceptions are the files in the $pkgroot/driver/ directory and the
$pkgroot/g6a/pcimem directory, to which the GNU General Public License
(hereafter GPL) is applied.

Redistribution of files in $pkgroot/cuda directory is prohibited.

The copyright of the software belongs to K&F Computing Research
Co. (hereafter KFCR), except for the following files:

The copyright of the Phantom-GRAPE-5, that is, files under the
$pkgroot/pg5/ directory, except for $pkgroot/pg5/phantom_g5mc.c,
belongs to Keigo Nitadori (RIKEN).  The copyright of
$pkgroot/pg5/phantom_g5mc.c belongs to KFCR.

The copyright of files under the $pkgroot/g6a/ directory and the
$pkgroot/g6bx/ directory belongs to Toshiyuki Fukushige (KFCR), except
for files under the $pkgroot/g6a/pcimem/ directory and the
$pkgroot/g6bx/pcimem/ directory, whose copyright belongs to Atsushi
Kawai (KFCR).  The copyright of files under the $pkgroot/driver/
directory also belongs to Atsushi Kawai (KFCR).

-------------------------------------------------------------------------------

The MIT Software License:

Copyright (c) 2009-, K&F Computing Research Co.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

-------------------------------------------------------------------------------


11. Acknowledgement
===================

K&F Computing Research Co. would like to thank the following people for
help in development of the GRAPE Software Package: Keigo Nitadori (RIKEN).


12. Modification History
========================

--------------------------------------------------------------------------------------------------
version  date         author            note
--------------------------------------------------------------------------------------------------
1.3.5    14-Jul-2012  AK                [GRAPE-9] APIs for potential calculation, the neighbour
                                        search, the nearest neighbour search, cutoff function,
                                        added to G6 library.

1.3.0    12-Feb-2012  AK                GRAPE-9 supported (G6 API).

1.2.4    02-Jan-2012  AK                [CUDA G5/G6] CUDA4.0 supported.

1.2.3    25-Oct-2011  TF, AK            [GRAPE-DR] the nearest neighbor search APIs added
                                        to G5 library.

1.2.1    14-Dec-2010  AK                [CUDA G5/G6] CUDA3.2 supported.

1.2.0    22-Jun-2010  AK                An optional package CUDA G5/G6 added.

1.1.4    12-Mar-2010  TF                Fixed a bug on G6 library for GRAPE-DR (libgdr6.a).

1.1.3    23-Dec-2009  TF                Support for new control logics of GRAPE-DR model2000, model460.
                                        Cutoff function added to the G5 API of GRAPE-DR.
                                        G6 API of GRAPE-DR modified to maintain backward compatibility.

1.1.2    28-Sep-2009  TF                Package management utility improved.

1.1.1    19-Sep-2009  AK                English documents added.

1.1.0    17-Sep-2009  AK, TF            Support for GRAPE-DR model2000, model460.

1.0      17-Jul-2009  A. Kawai,         Document created. The package is built based on gdrpkg0.32,
                      T. Fukushige      g7pkg2.2.1, g6apkg1.1, g6bx, and
                                        phantom_limited_accuracy_080110.
--------------------------------------------------------------------------------------------------

13. Contact
===========

Contact address for questions and bug reports:
K&F Computing Research (support@kfcr.jp)
