28 November 2009

CVS - Concurrent Versions System

"CVS (Concurrent Versions System) is a version control system that can record the history of your files (usually, but not always, source code). CVS only stores the differences between versions, instead of every version of every file you have ever created. CVS also keeps a log of who, when, and why changes occurred. CVS is very helpful for managing releases and controlling the concurrent editing of source files among multiple authors. Instead of providing version control for a collection of files in a single directory, CVS provides version control for a hierarchical collection of directories consisting of revision controlled files. These directories and files can then be combined together to form a software release."
from http://www.cvshome.org/eng/

Other CVS Documentation
Version Management with CVS by Per Cederqvist

Want to use CVS to check out the NCBI C toolkit directly from NCBI? See the NCBI C++ Toolkit manual
Want to set up a CVS server with the NCBI toolkit? See http://infrastructure.blueprint.org

Installing CVS on Centos (client)
as root:
yum install cvs.i386

QUICK DOCUMENTATION?
man cvs
or
cvs --help

CVS Source Code Retrieval for NCBI Toolkit
Public Read-only Access (cannot check-in)
HOW? On Centos, edit the .bash_profile file in your home directory.
Near your export NCBI line, add the line
export CVSROOT=:pserver:anoncvs@anoncvs.ncbi.nlm.nih.gov:/vault
then, from your home directory, type the command
source .bash_profile
MAIN CVS COMMANDS

checkout: Fetch a working copy of repository code / directories.
cvs --help checkout    Lists checkout options

import: Bring a new tree of source files into the repository.
cvs --help import

add: Schedule a new file or directory to be added to the repository.
cvs --help add

update: Bring your working copy up to date with changes in the repository.
cvs --help update

commit: This puts your changes back in the repository for other users.
cvs --help commit
cvs commit -m "Added new library code seqfast"

Source Code Inspection commands...

log: Changes to the file: who has changed what, when and how.
cvs --help log

diff: Show the differences between two versions of files.
cvs --help diff

NOTE these are better done with web-based cvs repository browsers:
http://www.ncbi.nlm.nih.gov/viewvc/cvs/ncbi/
ViewVC - a Web Based CVS or Subversion Browser

Graphical CVS Clients on Windows/Mac/Linux:
http://www.wincvs.org/
http://www.tortoisecvs.org/

Setting Environment Variables on Windows (from NCBI C++ tk manual)
GDB - GNU Debugger

"GDB, the GNU debugger, allows you to debug programs written in C, C++, Java, and other languages, by executing them in a controlled fashion and printing their data."
http://gnu.org/software/gdb/

Manual - Debugging with GDB

GDB COMMANDS CHEAT SHEET:
GDB is powerful software with many commands. Three of the most often used debugger commands are:

1a. Stopping execution at a specific line of code (or even fancier, when a specific condition occurs). This is called a breakpoint. Look up the line number in your source code - e.g. to stop at line 140 - so you can give it to gdb at its command prompt.

1b. Stopping execution where something goes wrong. Either you did this already and you have code with a Segmentation Fault or other run-time error, or you can intentionally do this with something very very bad like:

int bad_array[5];
printf ("%s",bad_array[8]);
/* this is out of bounds for the array - C won't warn you */
/* and printf will not like to print this uninitialized int as a string - this will either print out garbage or create a classic Segmentation Fault */

2. Printing variable values. Once your program has stopped at a breakpoint or an error, you can see inside variables! Just know which one you want to print out - e.g. myargs[0].intvalue - so you can give it to gdb at its command prompt. Even fancier, you can change variable values while the program is stopped and see what happens.

3. Figuring out which functions have been called already (stack trace). This shows the chain of function calls (including library functions) that was active when your program stopped (either at a breakpoint or because of some run-time error).

PRACTICAL EXAMPLE

STEP 1. Installing GDB on Centos
as root:
yum install gdb.i386

STEP 2. Setting up your makefiles for debugging.

In order for GDB to work, you have to compile code with an optional flag:
gcc -g
This embeds information inside the executable so that the debugger knows what source code line number each bit of executable code comes from, and what the variable names are. This makes the debug versions of your code larger, too.

Since all your compiling is controlled with make files, you must first change the make files. For this example we will use the fetchseqs.c and libseqfast.a in debug mode.
Edit the two makefiles: make.fetchseqs and make.seqfast. Find the line:
#OPTFLAG -g
and uncomment it:
OPTFLAG -g
Save both makefiles - they are now set to compile in debug mode.

NOTE this also removes the OPTFLAG -O3, which turns off built-in compiler optimizations. If your program suddenly works in debug mode (no bug), it is usually because of this change. You may have code that does not optimize properly (typically a latent bug, such as an uninitialized variable, that only shows up when the optimizer rearranges the code).

Remove all the *.o and *.a files from your fetchseqs directory so that the compiler will be forced to replace them with debug versions.

Compile with:
make -f make.seqfast
make -f make.fetchseqs

You should see the -g option in the compiler output. Now you should have a new, larger version of fetchseqs, with debug information embedded within. Now we can use gdb.

NOTE only the code compiled with -g is visible to the debugger. If the bug is in some other library, you cannot step through it line by line. In other words, you have to recompile any library you want to debug, otherwise it will be "silent". That is why we have recompiled both the source code and the library in this case.

STEP 3. Start the Debugger

Start gdb from the command line, passing it the name of your executable.

> gdb fetchseqs
GNU gdb Fedora (6.8-37.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb)

The (gdb) prompt offers a multitude of commands. You can type help and it will give you a list of command classes:

(gdb) help
List of classes of commands:

aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands

Type "help" followed by a class name for a list of commands in that class.
Type "help all" for the list of all commands.
Type "help" followed by command name for full documentation.
Type "apropos word" to search for commands related to "word".
Command name abbreviations are allowed if unambiguous.
(gdb)

Try this:

(gdb) run -

NOTE: The argument passed to fetchseqs is now "-". To lock in arguments, you can use the command set args.

Here is what you should see - the program running and printing out the argument list.

Starting program: /home/chogue/readseqs/fetchseqs -
[Thread debugging using libthread_db enabled]
[New Thread 0xb7fdd6c0 (LWP 27924)]

FetchSeqs arguments:

  -g  Single GI number [Integer]  Optional
    default = 0
  -i  Input File List of GI or Accessions, one per line [File In]
    default = NULL
  -o  Output File [File Out]
    default = stdout
  -a  Single Accession Code [String]  Optional
    default = NULL
  -q  Quiet Mode (T/F) [T/F]  Optional
    default = F
  -d  Database To Use [Data In]  Optional
    default = pdbaa.faa
  -r  Report ONLY (0) FASTA Files (1) FASTA Definition Lines (2) Accession Codes (3) GI numbers [Integer]  Optional
    default = 0

Program exited with code 01.
(gdb)

OK, so far no bugs...

SO - The three most useful gdb commands are:

1. run -args-
Runs the program (to completion if there is no bug, breakpoint or condition). Pass the arguments here after the run command!

2. break myprog.c:140
Sets a breakpoint at line 140 in file myprog.c, making the run command stop there. Shorthand is just the letter b.

3. where
Prints out the stack trace once the program is stopped.

Other commands:

continue or 'c'
Proceeds to the next breakpoint or to the end of the program.

print -var-
Prints the variable in current scope. Shorthand is the letter p. Examples:
print i
print a[3]
p myargs[0].intvalue

next
Executes the next line of source code, running any function called on that line to completion without stopping inside it. Shorthand is the letter n.

step
Executes the next line of source code, but if that line calls a function compiled with -g, stops at the first line inside that function. Shorthand is the letter s.

Try the following:
RUN the program WITH a BOGUS GI as argument.
Set a BREAKPOINT at the line where GetArgs is called. Look up the line number in fetchseqs.c
RUN it again with the BOGUS GI.
Print the value of the GI in the variable myargs[0].intvalue

(gdb) run -g 1234
Starting program: /home/chogue/readseqs/fetchseqs -g 1234
[Thread debugging using libthread_db enabled]
[New Thread 0xb7efa6c0 (LWP 27943)]
[fetchseqs] ERROR: GetSeqByGI: GI was not found in database. (null)
Program exited normally.
(gdb) b fetchseqs.c:[USE GETARGS LINE NUMBER HERE]
Breakpoint 1 at 0x8049d11: file fetchseqs.c, line 145.
(gdb) run -g 1234
Starting program: /home/chogue/readseqs/fetchseqs -g 1234
[Thread debugging using libthread_db enabled]
[New Thread 0xb7f456c0 (LWP 27945)]

Breakpoint 1, Nlm_Main () at fetchseqs.c:145
145         f = stdout;
(gdb) print myargs[0].intvalue
$1 = 1234
(gdb)

STEP 4. Try a BUGGY version of fetchseqs.c

Download the attached file fetchseqs_bug.c
Rename the good version of fetchseqs.c:

>mv fetchseqs.c fetchseqs_good.c

Compare the two versions with the Unix "diff" command, so you can see which line numbers are different.

>diff fetchseqs_bug.c fetchseqs_good.c
136d135
< int Bad_array[5]; /* THIS IS the bug */
145,148d143
< printf("%s",Bad_array[8]);
< /* This will SEG FAULT - consequence of randomly assigning some unitialized block of memory to a string-handling statement */

Good idea to set your break point here, at line 145.

Copy the buggy version to fetchseqs.c and compile it with the debug version of the make.fetchseqs makefile.
Try running this version with no arguments - it should report a Segmentation fault error (or something like that) because the error we put in is at the very start of the code.
Then try running it under gdb. Once the code stops, try the where command and you will see the function trace (like the bottom of the session below).

[chogue@localhost readseqs]$ ./fetchseqs
Segmentation fault
[chogue@localhost readseqs]$ gdb fetchseqs
GNU gdb Fedora (6.8-37.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) run
Starting program: /home/chogue/readseqs/fetchseqs
[Thread debugging using libthread_db enabled]
[New Thread 0xb7f3f6c0 (LWP 27990)]

Program received signal SIGSEGV, Segmentation fault.
0x008c1fab in strlen () from /lib/libc.so.6
(gdb) where
#0  0x008c1fab in strlen () from /lib/libc.so.6
#1  0x008941ff in vfprintf () from /lib/libc.so.6
#2  0x008999c3 in printf () from /lib/libc.so.6
#3  0x08049d0b in Nlm_Main () at fetchseqs.c:145
#4  0x082f6908 in main ()
(gdb)

What does this mean? The standard C library function strlen() died. WHY? The %s in the printf call told it to print a string, but it was handed an int read from beyond the end of Bad_array. That uninitialized value was treated as the address of a string (whoops!), and strlen() crashed when it followed the bogus address looking for the '\0' terminator. Read the trace from the bottom up: frame #3 is the line in your own code (fetchseqs.c:145) that started the trouble. This is a very common problem in C code.

STEP 5. Try debugging the good version, stepping through it, listing the code around each step, printing variable values.

For ordinary C code you can set a breakpoint at main with
(gdb) b main
For NCBI Toolkit code you need to use this:
(gdb) b Nlm_Main

Here is a session - note the list (l), next (n) and step (s) commands.

[chogue@localhost readseqs]$ gdb fetchseqs
GNU gdb Fedora (6.8-37.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) b Nlm_Main
Breakpoint 1 at 0x8049ca5: file fetchseqs.c, line 128.
(gdb) run -g 1234
Starting program: /home/chogue/readseqs/fetchseqs -g 1234
[Thread debugging using libthread_db enabled]
[New Thread 0xb7ff36c0 (LWP 28082)]

Breakpoint 1, Nlm_Main () at fetchseqs.c:128
128        ReadDBFILEPtr rdbfp=NULL;
(gdb) l
123     }
124
125
126     Int2 Main(void)
127     {
128        ReadDBFILEPtr rdbfp=NULL;
129        CharPtr seq=NULL;
130        Boolean start=TRUE;
131        FILE *f;
132        FILE *fin;
(gdb) n
129        CharPtr seq=NULL;
(gdb) n
130        Boolean start=TRUE;
(gdb) n
133        Int4 gi=0;
(gdb) n
134        ValNodePtr pvnList = NULL; /* This linked list will hold the GI numbers parsed from the input file */
(gdb) n
135        ValNodePtr pvnHere = NULL;
(gdb) n
138        if ( !GetArgs("FetchSeqs", NUMARGS, myargs) ) return 1;
(gdb) p myargs[0].intvalue
$1 = 0
(gdb) n
144        if (!StringCmp(myargs[arg_output_filename].strvalue,"stdout")) { /* this test returns 0=FALSE if they match */
(gdb) p myargs[0].intvalue
$2 = 1234
(gdb) s
145           f = stdout;
(gdb) s
159        rdbfp=OpenProteinFastaDB(myargs[arg_database].strvalue); /* This library function reports mislabeled file errors by itself */
(gdb) continue
(gdb) quit

Final thoughts.

GDB supports many languages:
C and C++
Objective-C
Fortran
Pascal
Modula-2
Ada

Yes, you can run software in reverse (this needs GDB 7.0 or later, newer than the 6.8 shown in the sessions above, and on most targets execution must first be recorded with the record command):
set exec-direction reverse
To go forwards again:
set exec-direction forward

AND IF YOU LIKE the GNU DEBUGGER - TRY THE GNU PROFILER:

When you compile with -pg, the GNU profiler (gprof) can tell you how much time your code spends in each function. Use this to find time-wasting problems in your code.

http://www.ibm.com/developerworks/library/l-gnuprof.html
http://sourceware.org/binutils/docs-2.16/gprof/