Convert ICD data from wide to long format
Note the distinction between labelling existing data with any classes which icd provides, and actually converting the structure of the data.
wide_to_long(
  x,
  visit_name = get_visit_name(x),
  icd_labels = NULL,
  icd_name = "icd_code",
  icd_regex = c("icd", "diag", "dx_", "dx")
)
Argument | Description |
---|---|
x | Wide-format data frame to convert. |
visit_name | The name of the column in the data frame which contains the patient or visit identifier. Typically this is the visit identifier, since patients leave and enter hospital with different ICD-9 codes. It is a character vector of length one. If left empty, the default shown in the usage, get_visit_name(x), applies. |
icd_labels | Vector of column names in which codes are found. If NULL, all columns matching the regular expression in icd_regex are used. |
icd_name | The name of the column in the long-format output that holds the ICD codes. Default is "icd_code". |
icd_regex | Vector of character strings containing regular expressions to identify ICD-9 diagnosis columns, tried in order (case-insensitive). Default is c("icd", "diag", "dx_", "dx"). |
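The column detection described for icd_regex can be sketched in base R. This is only one plausible reading of "try in order": each pattern is tried in turn and the first one that matches any column names wins. The helper guess_icd_cols is hypothetical and is not part of icd.

```r
# Hypothetical helper, not part of icd: try each pattern in order and return
# the column names matched by the first pattern that matches anything.
guess_icd_cols <- function(x, icd_regex = c("icd", "diag", "dx_", "dx")) {
  for (pattern in icd_regex) {
    # Case-insensitive match against the column names only.
    hits <- grep(pattern, names(x), ignore.case = TRUE, value = TRUE)
    if (length(hits) > 0) {
      return(hits)
    }
  }
  character(0)
}

widedf <- data.frame(
  visit_name = c("a", "b", "c"),
  icd9_01 = c("441", "4424", "441"),
  icd9_02 = c(NA, "443", NA)
)
icd_cols <- guess_icd_cols(widedf)
```

Passing icd_labels explicitly avoids relying on any such guess when column names are ambiguous.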
Returns a data.frame with the visit_name column named the same as in the input, and a column named by icd_name containing all the non-NA and non-empty codes found in the wide input data.
Reshaping data is a common task, and is made easier here by knowing more about the underlying structure of the data. This function wraps the reshape function with specific behavior and checks related to ICD codes. Empty strings and NA values will be dropped, and everything else kept. No validation of the ICD codes is done.
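The dropping rule above can be illustrated with a minimal base R sketch. It stacks the code columns by hand rather than calling reshape, so it is not the icd implementation; the function name wide_to_long_sketch and the choice to sort rows by visit are assumptions made for illustration.

```r
# Hypothetical stand-in for illustration; this is not the icd implementation.
wide_to_long_sketch <- function(x, visit_name, icd_labels,
                                icd_name = "icd_code") {
  # Build one (visit, code) block per ICD column, then stack the blocks.
  pieces <- lapply(icd_labels, function(col) {
    data.frame(
      visit = as.character(x[[visit_name]]),
      code = as.character(x[[col]]),
      stringsAsFactors = FALSE
    )
  })
  long <- do.call(rbind, pieces)
  # Drop NA and empty-string codes; keep everything else without validation.
  long <- long[!is.na(long$code) & nzchar(long$code), , drop = FALSE]
  # Group rows by visit (assumption); order() is stable, so codes keep their
  # column order within each visit.
  long <- long[order(long$visit), , drop = FALSE]
  names(long) <- c(visit_name, icd_name)
  rownames(long) <- NULL
  long
}

widedf <- data.frame(
  visit_name = c("a", "b", "c"),
  icd9_01 = c("441", "4424", "441"),
  icd9_02 = c(NA, "443", NA)
)
long <- wide_to_long_sketch(widedf, "visit_name", c("icd9_01", "icd9_02"))
```

Here visits "a" and "c" contribute one row each because their second code is NA, while visit "b" contributes two rows.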
As is common with many data sets, key variables can be concentrated in one column or spread over several. Because the format of clinical and administrative hospital data is known, we can perform the conversion efficiently and accurately, while keeping some metadata about the codes intact, e.g. whether they are ICD-9 or ICD-10.
Long or wide format ICD data are all expected to be in a data frame. The data.frame itself does not carry any ICD classes at the top level, even if it only contains one type of code; whereas its constituent columns may have a class specified, e.g. icd9 or icd10who.
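The difference between a class on a column and a class on the data frame can be seen with plain S3 attributes. In this sketch the class vector c("icd9", "character") is set by hand and is an assumption; the classes that icd itself attaches may differ.

```r
long <- data.frame(
  visit_name = c("a", "b", "b", "c"),
  icd_code = c("441", "4424", "443", "441"),
  stringsAsFactors = FALSE
)

# Label only the code column; the class vector here is illustrative.
class(long$icd_code) <- c("icd9", "character")

# The column carries the label, the containing data frame does not.
column_is_icd9 <- inherits(long$icd_code, "icd9")
frame_is_icd9 <- inherits(long, "icd9")
```

With this layout column_is_icd9 is TRUE and frame_is_icd9 is FALSE, which matches the convention described above.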
Other ICD data conversion: comorbid_df_to_mat(), comorbid_mat_to_df(), convert, decimal_to_short(), long_to_wide(), short_to_decimal()
widedf <- data.frame(
  visit_name = c("a", "b", "c"),
  icd9_01 = c("441", "4424", "441"),
  icd9_02 = c(NA, "443", NA)
)
wide_to_long(widedf)
#>   visit_name icd_code
#> 1          a      441
#> 2          b     4424
#> 3          b      443
#> 4          c      441