| Title: | Data Frame Fingerprints and Lineage Figures |
|---|---|
| Description: | Profiles R data frames as compact data fingerprints using schema, shape, missingness, distribution, category, uniqueness, time, and role signals. It compares versions, identifies close relatives in a library of historical data sets, and renders portable HTML cards plus static PNG/PDF lineage figures for reports. |
| Authors: | Tony Lu [aut, cre] |
| Maintainer: | Tony Lu <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-06 08:14:12 UTC |
| Source: | https://github.com/tonyisfool/datadna |
A modified version of customers_old with distribution, category,
missingness, and schema changes.
A data frame with 180 rows and 9 columns.
Synthetic data generated for package examples.
A small synthetic customer table used to demonstrate data DNA profiling.
A data frame with 180 rows and 8 columns.
Synthetic data generated for package examples.
Profiles an R data frame into a compact identity object that records schema, shape, missingness, distributions, categories, uniqueness, time signals, and stable fingerprints.
data_dna(df, name = NULL, sample_size = 10000L)

| Argument | Description |
|---|---|
| df | A data frame. |
| name | Optional data set name shown on cards and print output. |
| sample_size | Maximum number of rows used for profiling. |

A data_dna object.
demo <- dna_example_customers()
dna <- data_dna(demo$customers_new, name = "customers_new")
dna
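
To make the profiled signals more concrete, the following base R sketch computes a few of them (shape, schema, missingness, uniqueness) for a toy table. It is only an illustration: `profile_sketch()` and the `toy` data frame are hypothetical names introduced here, and nothing below is taken from the package's actual implementation of `data_dna()`.

```r
# Hypothetical helper, NOT part of the package: a rough base-R profile
# covering a few of the signals named in the description.
profile_sketch <- function(df) {
  stopifnot(is.data.frame(df))
  list(
    # Shape: number of rows and columns.
    shape = c(rows = nrow(df), cols = ncol(df)),
    # Schema: first class of each column.
    schema = vapply(df, function(col) class(col)[1], character(1)),
    # Missingness: share of NA values per column.
    missing_rate = vapply(df, function(col) mean(is.na(col)), numeric(1)),
    # Uniqueness: distinct values (NA counts as one value) over row count.
    unique_rate = vapply(
      df,
      function(col) length(unique(col)) / length(col),
      numeric(1)
    )
  )
}

# Toy input invented for this sketch.
toy <- data.frame(
  id = 1:4,
  plan = c("basic", "pro", NA, "basic"),
  spend = c(10.5, NA, 7.25, 3),
  stringsAsFactors = FALSE
)

prof <- profile_sketch(toy)
str(prof)
```

A real profile would presumably also cover distributions, categories, and time signals, and would respect `sample_size` on large inputs; those parts are left out of this sketch.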
Render a laboratory-style data DNA card.
dna_card(x, file = NULL, open = FALSE)

| Argument | Description |
|---|---|
| x | A data frame or data_dna object. |
| file | Optional HTML file path. If supplied, the card is saved there. |
| open | Logical. Whether to open the saved file in a browser. |

An htmltools browsable object, returned invisibly when the card is saved to a file.
demo <- dna_example_customers()
card <- dna_card(demo$customers_new)
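
The `file` and `open` arguments can be combined to write a card to disk without launching a browser. The sketch below assumes the package is installed under the name `datadna` (inferred from the source repository, not stated in this manual) and skips the call when it is not available; `card_path` is a name chosen for the example.

```r
# Assumption: the package is installed as "datadna".
has_pkg <- requireNamespace("datadna", quietly = TRUE)

# Write to the session's temporary directory so nothing is left behind.
card_path <- file.path(tempdir(), "customers_new_card.html")

if (has_pkg) {
  demo <- datadna::dna_example_customers()
  # `file` and `open` are the documented arguments; open = FALSE keeps
  # the call non-interactive.
  datadna::dna_card(demo$customers_new, file = card_path, open = FALSE)
  print(file.exists(card_path))
}
```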
Compare two data DNA profiles.
dna_compare(x, y)

| Argument | Description |
|---|---|
| x | A data frame or data_dna object. |
| y | A data frame or data_dna object. |

A dna_comparison object.
demo <- dna_example_customers()
dna_compare(demo$customers_old, demo$customers_new)
Explain mutations between two data DNA profiles.
dna_diff(x, y)

| Argument | Description |
|---|---|
| x | A data frame or data_dna object. |
| y | A data frame or data_dna object. |

A dna_diff object containing a mutation table.
demo <- dna_example_customers()
dna_diff(demo$customers_old, demo$customers_new)
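
As a rough picture of what schema-level mutations between two versions can look like, the base R sketch below lists added, removed, and retyped columns. It is an assumption-laden illustration only: `schema_changes()` and the two toy data frames are hypothetical, and the structure of the mutation table that `dna_diff()` actually returns is not documented here.

```r
# Hypothetical helper, NOT the package's algorithm: report schema-level
# differences between an old and a new data frame.
schema_changes <- function(old, new) {
  stopifnot(is.data.frame(old), is.data.frame(new))
  old_types <- vapply(old, function(col) class(col)[1], character(1))
  new_types <- vapply(new, function(col) class(col)[1], character(1))

  # Columns present in both versions, checked for a change of class.
  shared <- intersect(names(old_types), names(new_types))
  retyped <- shared[old_types[shared] != new_types[shared]]

  list(
    added = setdiff(names(new_types), names(old_types)),
    removed = setdiff(names(old_types), names(new_types)),
    retyped = retyped
  )
}

# Toy inputs invented for this sketch.
old <- data.frame(id = 1:3, age = c(31L, 45L, 28L))
new <- data.frame(
  id = 1:3,
  age = c(31.5, 45, 28),
  region = c("n", "s", "n"),
  stringsAsFactors = FALSE
)

changes <- schema_changes(old, new)
str(changes)
```

Distribution, category, and missingness changes would need per-column statistics on top of this; they are omitted to keep the sketch short.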
Creates two small customer data frames designed to demonstrate DataDNA cards, comparison, and mutation reports.
dna_example_customers()
A list with customers_old and customers_new data frames.
demo <- dna_example_customers()
str(demo$customers_old)
Finds the closest relatives of a query data set by comparing its data DNA
against a named library of data frames or data_dna objects.
dna_match(x, library, top_n = 5L, sample_size = 10000L)

| Argument | Description |
|---|---|
| x | A data frame or data_dna object. |
| library | A named list of data frames or data_dna objects. |
| top_n | Maximum number of matches to return. |
| sample_size | Maximum number of rows used when profiling raw data frames. |

A dna_match object.
demo <- dna_example_customers()
lib <- list(old = data_dna(demo$customers_old), new = data_dna(demo$customers_new))
dna_match(demo$customers_new, lib)
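
Because the library may also hold raw data frames, the profiling step can be left to `dna_match()` itself. The sketch below passes unprofiled data frames and restricts the result with `top_n`, with `sample_size` capping the rows used for profiling. It assumes the package is installed under the name `datadna` and does nothing otherwise; `raw_lib` and `best` are names chosen for the example.

```r
# Assumption: the package is installed as "datadna".
has_pkg <- requireNamespace("datadna", quietly = TRUE)

if (has_pkg) {
  demo <- datadna::dna_example_customers()

  # A named library of raw data frames rather than data_dna objects.
  raw_lib <- list(old = demo$customers_old, new = demo$customers_new)

  # Ask for the single closest relative; sample_size applies to the raw
  # data frames that are profiled on the fly.
  best <- datadna::dna_match(
    demo$customers_old,
    raw_lib,
    top_n = 1L,
    sample_size = 100L
  )
  print(best)
}
```

Pre-profiling the library with `data_dna()`, as in the example above, is likely the better choice when the same library is queried repeatedly, since each data set is then profiled only once.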
Creates a static HTML/SVG lineage diagram for a dna_match object.
dna_match_card(match, file = NULL, open = FALSE)

| Argument | Description |
|---|---|
| match | A dna_match object. |
| file | Optional HTML file path. If supplied, the card is saved there. |
| open | Logical. Whether to open the saved file in a browser. |

An htmltools browsable object, returned invisibly when the card is saved to a file.
demo <- dna_example_customers()
lib <- list(old = data_dna(demo$customers_old), new = data_dna(demo$customers_new))
match <- dna_match(demo$customers_new, lib)
dna_match_card(match)
Creates a print-friendly, paper-style lineage figure for a dna_match
object using base R grid graphics. The figure can be drawn on the current
graphics device or saved directly to PNG or PDF.
dna_match_plot(match, file = NULL, width = 11, height = 7, dpi = 144)

| Argument | Description |
|---|---|
| match | A dna_match object. |
| file | Optional output path. Supported output formats are PNG and PDF. |
| width | Plot width in inches. |
| height | Plot height in inches. |
| dpi | Resolution used for PNG output. |

The input dna_match object, invisibly.
demo <- dna_example_customers()
lib <- list(old = data_dna(demo$customers_old), new = data_dna(demo$customers_new))
match <- dna_match(demo$customers_new, lib)
dna_match_plot(match)
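
For reports, the figure is usually written to a file rather than drawn on the current device. The sketch below saves one PNG and one PDF using the documented `file`, `width`, `height`, and `dpi` arguments. It assumes the package is installed under the name `datadna`, and that the `.png` and `.pdf` extensions select the output format (the exact list of supported extensions is not given in this manual).

```r
# Assumption: the package is installed as "datadna".
has_pkg <- requireNamespace("datadna", quietly = TRUE)

# Assumption: the output format is chosen from the file extension.
png_path <- file.path(tempdir(), "lineage.png")
pdf_path <- file.path(tempdir(), "lineage.pdf")

if (has_pkg) {
  demo <- datadna::dna_example_customers()
  lib <- list(
    old = datadna::data_dna(demo$customers_old),
    new = datadna::data_dna(demo$customers_new)
  )
  match <- datadna::dna_match(demo$customers_new, lib)

  # Smaller, higher-resolution PNG; dpi is documented as PNG-only.
  datadna::dna_match_plot(match, file = png_path, width = 8, height = 5, dpi = 300)

  # PDF at the default 11 x 7 inch size.
  datadna::dna_match_plot(match, file = pdf_path)
}
```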
Guess the species of a data frame.
dna_species(df)

| Argument | Description |
|---|---|
| df | A data frame. |

A character label such as customer_table, event_stream, or
wide_feature_matrix.
dna_species(dna_example_customers()$customers_new)