extract
Extract information from the polars.LazyFrame produced by parsing a raw VCF file.
genotypes
genotypes(lf: LazyFrame, col2expr: dict[str, Callable[[Expr, str], Expr]], format_str: str = 'GT:AD:DP:GQ') -> LazyFrame
Extract genotype information from a raw polars.LazyFrame.
Only lines whose format value matches format_str are considered.
Parameters:
- lf (LazyFrame) – The target polars.LazyFrame.
- col2expr (dict[str, Callable[[Expr, str], Expr]]) – A dict that associates each column name with the function applied to create the corresponding polars.LazyFrame column (produced by io.vcf.format2expr).
- format_str (str, default: 'GT:AD:DP:GQ') – Only variants whose format matches this string are considered.
Returns:
- LazyFrame – A polars.LazyFrame with the variant id, sample information, and genotype information.
Raises:
- NoGenotypeError – If no column of lf is named 'format'.
Source code in src/variantplaner/extract.py
merge_variants_genotypes
merge_variants_genotypes(variants_lf: LazyFrame, genotypes_lf: LazyFrame, sample_name: list[str]) -> LazyFrame
Merge the variants and genotypes polars.LazyFrame objects.
Parameters:
- variants_lf (LazyFrame) – LazyFrame of variants, with columns (id, chr, pos, ref, alt).
- genotypes_lf (LazyFrame) – LazyFrame of genotypes, with columns (id, sample, [genotype column]).
Returns:
- LazyFrame – A LazyFrame with all data.
Source code in src/variantplaner/extract.py
variants
variants(lf: LazyFrame) -> LazyFrame
Extract only the variant information from a polars.LazyFrame.
Parameters:
- lf (LazyFrame) – The target polars.LazyFrame.
Returns:
- LazyFrame – A polars.LazyFrame with only the variant information (id, chr, pos, ref, alt).
Source code in src/variantplaner/extract.py