2014
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads
Li P, Jiang X, Wang S, Kim J, Xiong H, Ohno-Machado L. HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads. Journal of the American Medical Informatics Association 2014, 21: 363-373. PMID: 24368726, PMCID: PMC3932469, DOI: 10.1136/amiajnl-2013-002147.
Peer-Reviewed Original Research
Concepts: Base quality values, Compression algorithm, Storage savings, Genome compression, Sequence Alignment/Map (SAM) format, Compression ratio, Novel compression algorithm, Comparable compression ratio, Compression mechanism, K-means clustering, Different reference genomes, Quality values, Decompression quality, Lossless compression, Execution time, Compression rate, Aligned reads, Map format, Algorithm, Biomedical community, Different quality values, Experimental datasets, Adaptive scheme, Storage capability, Archiving
2013
DNA-COMPACT: DNA COMpression Based on a Pattern-Aware Contextual Modeling Technique
Li P, Wang S, Kim J, Xiong H, Ohno-Machado L, Jiang X. DNA-COMPACT: DNA COMpression Based on a Pattern-Aware Contextual Modeling Technique. PLOS ONE 2013, 8: e80377. PMID: 24282536, PMCID: PMC3840021, DOI: 10.1371/journal.pone.0080377.
Peer-Reviewed Original Research
Concepts: Reference-free compression, Disk storage capacity, Compression algorithm, Decompression cost, Data transferring, Art algorithms, Compression performance, File size, Genome compression, Compression rate, Bit rate, Algorithm, DNA compression, Biomedical researchers, Performance advantages, Genome data, Modeling techniques, Contextual model, Important concern, Research purposes, Compression, Performance, Storage capacity, Bits, Reference sequence
Genome Sequence Compression with Distributed Source Coding
Wang S, Jiang X, Cui L, Dai W, Deligiannis N, Li P, Xiong H, Cheng S, Ohno-Machado L. Genome Sequence Compression with Distributed Source Coding. 2013, 525-525. DOI: 10.1109/dcc.2013.104.
Peer-Reviewed Original Research
Concepts: Encoder side, Source coding, File size, Low processing capabilities, High computational complexity, Limited communication bandwidth, File size reduction, Low-density parity-check (LDPC) decoders, Compression framework, Hash coding, Bandwidth usage, Communication bandwidth, Parity-check decoder, Sequence compression, Compression techniques, Adaptive code length, Computational complexity, Processing capabilities, Small storage, Memory requirements, Genome compression, Factor graph, Genome data, Code length, Coding