This document presents the new features available in ngklast release 4.4 and KServer 4.2.

1) KLAST search engine optimization

ngklast comes with an updated release of the KLAST sequence comparison tool, which can now work directly on BLAST databanks. For nucleotide comparison (klastn), using these databanks instead of Fasta files provides an additional 2x speedup.

Users of previous releases of ngklast have nothing to do: since ngklast ships with the BLAST tool, the Databank Manager of ngklast always builds BLAST databanks during an installation.

2) Search Module

A/ The Search Module uses a table presentation of query files. Column headers can be used to reorder queries by name and size.

B/ A new command has been added to filter queries using user-defined criteria.
C/ A new command has been added to export a query out of ngklast. This command can be useful to retrieve a query file filtered using the new ngklast «Filter» command.

D/ A new command has been added to display the sequence size distribution of a query.

3) Project Manager Module

A/ A completely new data handling system has been created to handle more data efficiently. ngklast data management is now about 10x faster than in the previous release of ngklast; biological classification data retrieval and management is now about 20x faster than in ngklast 4.3.

B/ Local KLAST now provides the estimated time and result size during comparison job processing. This information is displayed in the status bar of the software.

[Screenshot: comparison of black cottonwood CDS vs. SwissProt on a 4-core iMac computer.]

4) Databank Manager (KDMS)

A/ KDMS now handles sequence redundancy in a new way. When sequence files contain redundant sequence IDs, only one copy of the corresponding sequence is kept in the databank, and the installation process does not stop.

B/ Some optimizations have been made to speed up the indexing of sequences during databank installation.
C/ KDMS provides a new tool to filter sequence files. This tool is available in the Personal Databank panel of the Databank Manager. Filtered sequence files can then be used to prepare a databank for use with the BLAST and KLAST comparison tools.

D/ KDMS displays download progress status during sequence retrieval from FTP servers (file size, data transferred over time, remaining time to finish the file transfer).

E/ KDMS optimizes the installation of single large Fasta files (e.g. NCBI nt and nr databanks), halving their size on disk compared with previous releases of KDMS.
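The redundancy handling described in item A/ of the Databank Manager section can be illustrated with a minimal sketch. This is not the KDMS implementation: it assumes Fasta input, takes the sequence ID to be the first whitespace-delimited token of each header line, and keeps the first copy of each ID (these notes do not say which copy KDMS keeps). The function name `dedup_fasta` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of KDMS: read Fasta records on stdin and
# print only the first record seen for each sequence ID.
dedup_fasta() {
    awk '
        /^>/ {
            id = $1                # assumed ID: first token of the header line
            keep = !(id in seen)   # keep the record only if the ID is new
            seen[id] = 1
        }
        keep                       # print header and sequence lines of kept records
    '
}
```

For example, `dedup_fasta < input.fasta > unique.fasta` would write a copy of a (hypothetical) `input.fasta` without repeated IDs. Lines appearing before the first header are dropped.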
5) Tech notice

A/ ngklast is now shipped with a 64-bit Java Virtual Machine release 1.7 for all platforms. As a consequence, ngklast is now only available as a 64-bit application. This major change is due to optimizations in data management that require 64-bit operations (see section «Project Manager Module» above).

B/ Important notice for Linux users: by default we provide a KLAST binary compatible with the C++ standard library (libstdc++, GLIBCXX) up to version 3.4.13. If you use a more recent Linux distribution, you may consider using the more recent KLAST library, which can provide up to 20% additional speedup.

How to check your system? Use the following command on the command line:

    strings /usr/lib64/libstdc++.so.6 | grep "GLIBCXX_[0-9]"

(you may have to adapt the path to the C++ standard library, depending on your Linux distribution)

If the result displays GLIBCXX release 3.4.14 or above, then we invite you to use the fastest binary, as follows:

a. enter the directory "external/bin/linux" of ngklast
b. rename "libklib.so" to "KLib.so.libc3.4.13"
c. rename "libklib.so.lib3.4.14" to "libklib.so"

Carefully respect letter case, and ensure that the library has read permissions.

KServer 4.2 new features

A/ KServer is now capable of recovering its working status after a server shutdown.

B/ KServer no longer uses the DRMAA system to submit jobs to a cluster. Instead, the software uses shell scripts, which offer a more convenient way to schedule jobs. This new feature also makes it possible to deploy KServer with any type of job scheduling system.

C/ As a result of feature «B», KServer is now compatible with LSF, in addition to SGE, OGE and PBS already supported by previous KServer releases.

D/ The KServer Administration web page displays job progress status.

E/ KServer stability when scheduling large BLAST jobs has been improved.

KServer 4.2 tech notice

Only for Linux users: by default we provide a KLAST binary compatible with the C++ standard library (libstdc++, GLIBCXX) up to version 3.4.13.
If you use a more recent Linux distribution, you may consider using the more recent KLAST library, which can provide up to 20% additional speedup.

How to check your system?
Use the following command on the command line:

    strings /usr/lib64/libstdc++.so.6 | grep "GLIBCXX_[0-9]"

(you may have to adapt the path to the C++ standard library, depending on your Linux distribution)

If the result displays GLIBCXX release 3.4.14 or above, then we invite you to use the fastest binary, as follows:

a. enter the directory "external/bin/linux" of the KlastRunner application
b. rename "libklib.so" to "KLib.so.libc3.4.13"
c. rename "libklib.so.lib3.4.14" to "libklib.so"

Carefully respect letter case, and ensure that the library has read permissions.
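The GLIBCXX check described in the tech notices can also be scripted. The sketch below is not shipped with ngklast or KServer. The function names `max_glibcxx` and `is_at_least_3_4_14` and the `LIBSTDCXX` variable are hypothetical, and the script assumes a `grep` that supports `-o` (GNU and BSD grep do). It only reports which binary applies and leaves the renaming steps to you.

```shell
#!/bin/sh
# Hypothetical helper: print the highest GLIBCXX_x.y.z version found on stdin.
max_glibcxx() {
    grep -o 'GLIBCXX_[0-9][0-9.]*' \
        | sed 's/^GLIBCXX_//' \
        | sort -t. -k1,1n -k2,2n -k3,3n \
        | tail -n 1
}

# Hypothetical helper: succeed if version $1 is 3.4.14 or above.
is_at_least_3_4_14() {
    lowest=$(printf '%s\n%s\n' "$1" "3.4.14" \
        | sort -t. -k1,1n -k2,2n -k3,3n \
        | head -n 1)
    [ -n "$1" ] && [ "$lowest" = "3.4.14" ]
}

# Path taken from the notice; set LIBSTDCXX to adapt it to your distribution.
lib=${LIBSTDCXX:-/usr/lib64/libstdc++.so.6}
if [ -r "$lib" ] && command -v strings >/dev/null 2>&1; then
    version=$(strings "$lib" | max_glibcxx)
    if is_at_least_3_4_14 "$version"; then
        echo "GLIBCXX $version found: the faster KLAST library can be used"
    else
        echo "GLIBCXX ${version:-not detected}: keep the default KLAST binary"
    fi
fi
```

If the library is not at the default location, the script prints nothing; run it with `LIBSTDCXX` set to the correct path.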