
School of Biological & Chemical Sciences

The Fugu Informatics Network

Fugu rubripes

A higher order assembly of the Fugu genome

Tanya Vavouri (1), Yvonne JK Edwards (2), Martin Goodson, Greg Elgar

(1) tvavouri@rfcgr.mrc.ac.uk, MRC RFCGR; (2) yjedward@rfcgr.mrc.ac.uk, MRC RFCGR



The pufferfish Fugu rubripes has a compact genome of approximately 365 megabases (Mb). In 2002, the whole-genome shotgun (WGS) assembly (release v.2.0) of this vertebrate genome was published in Science (Aparicio et al., 2002). Owing to the nature of the WGS approach, the non-repetitive part of the genome was at that stage contained in 12,403 unordered scaffolds ranging in size from 2 to 650 kilobases (Kb). Shortly afterwards a new assembly (v.3.0) was constructed, resulting in 8,023 scaffolds ranging in size from 2 to 1,100 Kb. Manual comparison of the two sets of scaffolds revealed that in many cases they were complementary. We therefore used the long-range information contained in both sets to construct a more contiguous assembly. Following a semi-automated approach, we ordered and oriented 2,716 scaffolds from release v.3.0 into 1,023 "Assemblies" spanning regions of between 8 Kb and 1,500 Kb. We then used the available BAC and cosmid end data to link scaffolds across even longer distances. Our current release of this higher order assembly contains 2,345 scaffolds ordered and oriented in 436 "SuperAssemblies" containing approximately 201 Mb of the genome. A further 44 Mb is contained in 474 Assemblies that have not been linked to any other scaffolds, either because no BAC or cosmid ends link them or because the available linking information does not pass our stringent criteria. These data are available through the Fugu RFCGR website and can be viewed either statically in a web browser or dynamically through the genome annotation application Apollo. Our assembly allows long-range comparisons to be made between Fugu and other vertebrate genomes, and has already revealed some striking regions of high conservation between Fugu and mammalian genomes.
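
The idea of linking scaffolds through BAC and cosmid end data can be illustrated with a minimal Python sketch. It is not the procedure used for this assembly: the abstract does not state the linking criteria, so the `min_links` threshold, the function name `group_scaffolds`, and the input format (one tuple per clone, naming the scaffolds hit by its two end sequences) are all assumptions made for illustration. The sketch only groups scaffolds that are joined by enough independent clones; it does not order or orient them.

```python
from collections import defaultdict


def group_scaffolds(end_pairs, min_links=2):
    """Group scaffolds joined by at least `min_links` distinct clones.

    end_pairs: iterable of (clone_id, scaffold_a, scaffold_b) tuples, where
    scaffold_a and scaffold_b are the scaffolds matched by the two end
    sequences of one BAC or cosmid clone (hypothetical input format).
    min_links: hypothetical stringency threshold, not taken from the source.
    """
    # Collect the distinct clones supporting each unordered scaffold pair.
    support = defaultdict(set)
    for clone_id, scaf_a, scaf_b in end_pairs:
        if scaf_a != scaf_b:  # both ends on one scaffold carry no link
            support[frozenset((scaf_a, scaf_b))].add(clone_id)

    # Union-find over scaffolds that take part in an accepted link.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair, clones in support.items():
        if len(clones) >= min_links:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    # Gather the members of each connected group.
    groups = defaultdict(set)
    for scaf in list(parent):
        groups[find(scaf)].add(scaf)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    # Invented example data, for illustration only.
    pairs = [
        ("bac1", "S1", "S2"),
        ("bac2", "S1", "S2"),
        ("cos1", "S2", "S3"),
        ("cos2", "S2", "S3"),
        ("bac3", "S4", "S4"),
        ("bac4", "S5", "S6"),
    ]
    print(group_scaffolds(pairs, min_links=2))
```

Scaffolds supported by too few clones (here S5 and S6 when `min_links=2`) are left out of every group, which mirrors the Assemblies that remain unlinked because the available linking information is insufficient.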