Changelog
Source: NEWS.md
Rduckhts 1.1.6.9000-0.0.2 (Development version)
- Fix Wasm package builds under `rwasm` / r-universe: the package `configure` script now preserves injected `NAME=VALUE` cache overrides, forwards explicit `--build`/`--host` triplets into the vendored `htslib` `./configure`, forwards webR’s Emscripten port flags for `zlib`/`bzip2`, seeds wasm-safe Autoconf cache results for `zlib`/`bzip2`/socket probes, injects a tiny Emscripten-only socket compatibility shim for `recv`/`send`/`closesocket`, and disables only the optional `htslib` features that are not available in the stock webR/r-universe wasm toolchain (`libcurl`, `S3`, `GCS`, `lzma`, plugins); this fixes the original `ac_cv_func_getrandom=no: command not found` failure and the subsequent nested `htslib` cross-compile probe failures without changing native configure behavior.
- Fix bundled extension wasm artifacts: the upstream CMake wasm build now rebuilds `libduckhts.a` as a fat archive containing vendored `htslib` (and any static archive dependencies CMake can see), so DuckDB wasm packaging no longer depends on `extension-ci-tools` changes just to avoid unresolved symbols such as `bcf_readrec` at `LOAD`.
Rduckhts 1.1.6-0.0.2 (2026-04-09)
CRAN release: 2026-04-09
- Fix `test_bam_file_offset`: cast `COUNT(*)` results to `INTEGER` in SQL so the DuckDB driver returns R `integer` rather than `numeric` (BIGINT maps to double in the duckdb R driver), restoring `expect_identical` assertions.
Rduckhts 1.1.6-0.0.1 (2026-04-09)
- Fix bundled `read_hts_index_spans(...)` / `rduckhts_hts_index_spans()`: the span view now returns real chunk rows from CSI/TBI/BAI indexes, including populated `bin`, `chunk_beg_vo`, `chunk_end_vo`, `chunk_bytes`, `seq_start`, and `seq_end` values instead of placeholder `NA`s; BCF-backed calls also avoid the previous noisy `tbx` probe warning on `.csi` indexes.
- Add `FILE_OFFSET` column to `rduckhts_bam()` / `read_bam(...)`: exposes the BGZF virtual file offset after each record. Zero runtime overhead (macro over already-open struct fields). Enables `ORDER BY FILE_OFFSET` in SQL `LAG()` / `LAST_VALUE()` window functions to reproduce exact BAM file order for streaming deduplication algorithms. Together with the `//` integer-division operator and `LAST_VALUE(... IGNORE NULLS)`, this permits exact replication of WisecondorX’s larp/larp2 state machine in pure SQL, confirmed at 0 mismatches across 25,115 non-zero bins on a real NIPT BAM.
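
The `FILE_OFFSET` and `chunk_beg_vo` / `chunk_end_vo` values above are BGZF virtual offsets. As a rough illustration of why ordering by such a value reproduces file order, the sketch below packs and unpacks virtual offsets using the layout from the SAM/BAM specification (compressed block start in the upper 48 bits, position inside the uncompressed block in the lower 16 bits). Python is used purely for illustration; the function names and the record values are hypothetical and are not part of Rduckhts or htslib.

```python
from typing import Tuple

# Illustrative helpers only (hypothetical names, not the Rduckhts API).
BGZF_UOFFSET_BITS = 16  # low 16 bits: offset inside the uncompressed block


def split_virtual_offset(voffset: int) -> Tuple[int, int]:
    """Return (compressed_block_offset, offset_within_uncompressed_block)."""
    return voffset >> BGZF_UOFFSET_BITS, voffset & 0xFFFF


def make_virtual_offset(coffset: int, uoffset: int) -> int:
    """Pack a block start and an in-block offset into one integer."""
    if not 0 <= uoffset < (1 << BGZF_UOFFSET_BITS):
        raise ValueError("uoffset must fit in 16 bits")
    return (coffset << BGZF_UOFFSET_BITS) | uoffset


# Made-up record positions, deliberately listed out of order.
records = [
    ("read_c", make_virtual_offset(65_000, 12)),
    ("read_a", make_virtual_offset(0, 500)),
    ("read_b", make_virtual_offset(0, 900)),
]

# Sorting by the packed value orders by block first, then by position inside
# the block, which matches the order the records occupy in the file.
in_file_order = [name for name, _ in sorted(records, key=lambda r: r[1])]
```

Because the block offset occupies the high bits, a plain integer sort is enough; no separate two-key comparison is needed.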
Rduckhts 1.1.5-0.0.1 (2026-04-08)
- Fix bundled `bcftools_liftover(...)` / `rduckhts_liftover()` cache and realignment hardening: per-thread chain/FASTA contexts are now bounded instead of accumulating for the lifetime of worker threads, and scalar left-alignment no longer reuses stale traceback state after failed/empty alignments.
- Fix bundled `read_bam(...)` / `rduckhts_bam()` and `read_bcf(...)` / `rduckhts_bcf()` indexed parallel full scans when headers contain leading empty contigs: contig claiming now retries iteratively instead of recursively, and the BAM reader no longer returns an empty chunk after successfully handing off to the next contig.
- Keep the top-level extension `README.Rmd` examples aligned with direct extension usage: the extension README now renders its example queries through a custom DuckDB SQL knitr engine instead of `R`/`DBI`, and its liftover example uses bundled fixtures rather than temporary `R`-generated FASTA/chain files.
- Fix bundled Windows GNU CMake builds: the vendored `htslib` configure step now distinguishes `windows_amd64_mingw` from `windows_amd64_rtools`; the MinGW path keeps the smaller `configure.win`-style library set, while the Rtools path restores the fuller static `libcurl` dependency closure required by its `htslib` feature probes. `CURL_STATICLIB` remains on the built objects rather than on `./configure` test probes.
- Fix bundled Windows `windows_amd64_rtools` CMake builds: the upstream extension `Makefile` now pins `CC`/`AR`/`RANLIB` from `R CMD config`, avoiding mixed non-Rtools compiler and Rtools library selection when vendored `htslib` is configured; the vendored `htslib` CMake path also returns to separate configure/build steps on MinGW for simpler diagnostics and behavior, and MinGW static-libcurl builds now define `CURL_STATICLIB` to match Rtools `libcurl.a`.
- Fix bundled `read_bcf(...)` / `rduckhts_bcf()` mapping of fixed-count INFO/FORMAT arrays: exact-cardinality fields such as `Number=2` and `Number=4` now materialize as DuckDB array/list columns instead of silently dropping all but the first value.
- Fix bundled `read_bcf(...)` / `rduckhts_bcf()` handling of string FORMAT lists such as DRAGEN `FORMAT/LAA`: `Number != 1` string FORMAT fields now materialize as `VARCHAR[]` instead of triggering DuckDB internal assertion failures.
- Fix bundled `duckdb_munge(...)` / `rduckhts_munge()` multithreaded FASTA lookups: FASTA index handles are now thread-local and FASTA fetches are synchronized in `munge`, avoiding intermittent `fai_retrieve` failures and aborts when `fasta_ref` is used with `PRAGMA threads > 1`.
- Add `rduckhts_score()`: polygenic risk score computation backed by the `bcftools +score` plugin, supporting GT/DS/HDS/AP/GP/AS dosage modes, all major GWAS summary presets (PLINK, PLINK2, REGENIE, SAIGE, BOLT, METAL, PGS, SSF/GWAS-SSF), GWAS-VCF multi-PRS scoring, p-value thresholding, sample subsetting, and region/filter controls.
- Add `rduckhts_munge()`: GWAS summary statistics normalization backed by `bcftools +munge`, with FASTA reference allele resolution, swap-aware effect/frequency transforms, and METAL meta-analysis column support.
- Add `rduckhts_liftover()`: variant coordinate liftover backed by `bcftools +liftover` using UCSC chain files, with full indel normalization, INFO/END lifting, and MT passthrough.
- Add `rduckhts_bed()` for BED3–BED12 interval files and `rduckhts_fasta_nuc()` for nucleotide composition over BED intervals or fixed-width bins.
- Add compression and index helpers: `rduckhts_bgzip()`, `rduckhts_bgunzip()`, `rduckhts_bam_index()`, `rduckhts_bcf_index()`, and `rduckhts_tabix_index()`.
- Add HTS metadata readers: `rduckhts_hts_header()`, `rduckhts_hts_index()`, `rduckhts_hts_index_spans()`, and `rduckhts_hts_index_raw()`.
- Add quality encoding controls to `rduckhts_bam()` and `rduckhts_fastq()` (`quality_representation`, `input_quality_encoding`) and `rduckhts_detect_quality_encoding()` for heuristic FASTQ encoding detection.
- Add `sequence_encoding := 'nt16'` parameter to `rduckhts_bam()`, `rduckhts_fasta()`, and `rduckhts_fastq()` for raw htslib nt16 sequence output as `UTINYINT[]`.
- Add SAM flag helpers `sam_flag_bits()` and `sam_flag_has()`, CIGAR utility functions, and `is_forward_aligned()`.
- Bundle duckhts 1.1.5 extension.
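
For readers unfamiliar with SAM flags, the following sketch shows the kind of bit test that helpers like `sam_flag_has()` and `sam_flag_bits()` presumably perform. The bit values come from the SAM specification; the Python function names, return shapes, and label strings are assumptions made for illustration and may not match the actual signatures or output of the bundled SQL functions.

```python
# FLAG bits as defined in the SAM specification. The label strings are
# informal names chosen for this example.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def flag_has(flag: int, mask: int) -> bool:
    """True when every bit in `mask` is set in `flag` (hypothetical helper)."""
    return flag & mask == mask


def flag_names(flag: int) -> list:
    """List the labels of all bits set in `flag`, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# 99 = 0x1 + 0x2 + 0x20 + 0x40, a common flag for the first read of a
# properly paired alignment whose mate is on the reverse strand.
first_of_pair = flag_names(99)
```

Testing against a mask rather than a single bit lets one call check several conditions at once, for example `flag_has(flag, 0x1 | 0x2)` for "paired and properly paired".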
Rduckhts 0.1.3-0.0.2
CRAN release: 2026-02-24
- Conditionally enable plugins on Windows.
- Update the configure script to avoid check failure on CRAN macOS.
- Update the extension version to 0.1.3.
Rduckhts 0.1.2-0.1.5
- Fixed inadvertent removal of libexec
- Updated the plugin to add header table functions
Rduckhts 0.1.2-0.0.9000
- Various fixes for CRAN submission.
- Updated DESCRIPTION Title/Description formatting and added HTSlib reference.
- Removed default write paths in bootstrap/build helpers; now require explicit paths.
- setup_hts_env now accepts an explicit plugins_dir parameter.
- duckhts_build now accepts a make argument (GNU make required).
- Modified configure to attempt to support wasm.
- Update bootstrapped extension code to match `duckhts` 0.1.2.
- Add SAMtags + auxiliary tag support (standard_tags, auxiliary_tags).
- Add tabix header/typing options (header, header_names, auto_detect, column_types).