Kielce, Poland 2007

gp2fasta

gp2fasta converts gp files in NCBI GenPept or GenBank format to FASTA. Its main purpose is to create FASTA files with short, but still accurate, headers for each sequence.

For example:
>Strpur-115729834-h
PNQILMQFRLDDNGSSYYKELASIIYGASPEFELAIFTVCFKENPNALSTFTMAGGITQKVQTWDYNGGYIGSAYFSV

stands for:
gi: 115729834
organism: Strongylocentrotus purpuratus
additional: h (hypothetical protein)
sequence: PNQILMQFRLDDNGSSYYKELASIIYGASPEFELAIFTVCFKENPNALSTFTMAGGITQKVQTWDYNGGYIGSAYFSV
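
The mapping above can also be read in the other direction. The following is a minimal sketch, not part of gp2fasta itself, of how a downstream script might split such a header back into its fields. The function name parse_header is hypothetical, and the sketch assumes the field order shown in the example (organism, id, additional) and that the separator does not occur inside any field.

```python
def parse_header(header, separator="-"):
    """Split a short header such as '>Strpur-115729834-h' into its fields.

    Assumption: fields appear in the order organism, id, additional,
    and the 'additional' field may be absent.
    """
    fields = header.lstrip(">").split(separator)
    organism, identifier = fields[0], fields[1]
    additional = fields[2] if len(fields) > 2 else ""
    return {"organism": organism, "id": identifier, "additional": additional}


if __name__ == "__main__":
    print(parse_header(">Strpur-115729834-h"))
```

If a different separator or field selection was used when the file was created, the sketch would need to be adjusted accordingly.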

 
Options:
- for id: GI or LOCUS;
- for organism: e.g. Mus musculus, M.musculus or Musmus;
- detailed definition;
- additional information:
    P -> PREDICTED
    s -> similar
    h -> hypothetical protein
    u -> unnamed protein product
    n -> novel
    p -> putative
    o -> open reading frame

The selected fields are joined with a separator character ("-" in the example above).
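
To make the options concrete, here is a rough sketch of how such a header could be assembled. It is an illustration only, not the actual gp2fasta code: the names format_organism, additional_codes and build_header are hypothetical, and the keyword matching (a case-insensitive substring search in the definition, with codes concatenated in the order of the list above) is an assumption about how the additional codes might be derived.

```python
# Letter codes from the options list; the matching rule below is an assumption.
ADDITIONAL_CODES = [
    ("PREDICTED", "P"),
    ("similar", "s"),
    ("hypothetical protein", "h"),
    ("unnamed protein product", "u"),
    ("novel", "n"),
    ("putative", "p"),
    ("open reading frame", "o"),
]


def format_organism(name, style="short"):
    """Format a binomial name as 'Mus musculus', 'M.musculus' or 'Musmus'."""
    genus, species = name.split()[:2]  # ignores strain names after the species
    if style == "full":
        return f"{genus} {species}"
    if style == "initial":
        return f"{genus[0]}.{species}"
    if style == "short":
        return genus[:3].capitalize() + species[:3].lower()
    raise ValueError(f"unknown organism style: {style}")


def additional_codes(definition):
    """Collect the one-letter codes whose keyword occurs in the definition."""
    lowered = definition.lower()
    return "".join(code for keyword, code in ADDITIONAL_CODES
                   if keyword.lower() in lowered)


def build_header(identifier, organism, definition, style="short", separator="-"):
    """Join organism, id and (if any) additional codes with the separator."""
    parts = [format_organism(organism, style), identifier]
    extra = additional_codes(definition)
    if extra:
        parts.append(extra)
    return ">" + separator.join(parts)


if __name__ == "__main__":
    print(build_header("115729834", "Strongylocentrotus purpuratus",
                       "hypothetical protein"))
```

With the values from the example above, this sketch reproduces the header >Strpur-115729834-h.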




Example file
Example file2
Mycoplasmoides genitalium G37 (proteome, 603 proteins)
hsp23 (Heat Shock Protein 23, 228 proteins)

Free Qt4 version of gp2fasta with a GUI





Please send suggestions and comments to: Lukasz P. Kozlowski
Author: Lukasz Kozlowski
Last modified: 26.02.2026

