diff --git a/NEWS.md b/NEWS.md
index 021ddbbb6..f62bfe6a3 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -50,6 +50,8 @@
 
 9. `fread()` no longer replaces a literal header column name `"NA"` with an auto-generated `Vn` name when `na.strings` includes `"NA"`, [#5124](https://github.com/Rdatatable/data.table/issues/5124). Data rows still continue to parse `"NA"` as missing. Thanks @Mashin6 for the report and @shrektan for the fix.
 
+10. `yearqtr()` and `yearmon()` gain an optional `format` argument, [#7694](https://github.com/Rdatatable/data.table/issues/7694). The default, `"numeric"`, preserves the original behavior, while `"character"` returns strings of the form `YYYYQ#` (e.g., `"2025Q2"`) for `yearqtr()` and `YYYYM#` (e.g., `"2025M4"`) for `yearmon()`. Thanks to @jan-swissre for the report and @LunaticSage218 for the implementation.
+
 ### Notes
 
 1. {data.table} now depends on R 3.5.0 (2018).
diff --git a/R/IDateTime.R b/R/IDateTime.R
index 49fa5abda..0bb2568c8 100644
--- a/R/IDateTime.R
+++ b/R/IDateTime.R
@@ -365,8 +365,30 @@ isoyear = function(x) as.integer(format(as.IDate(x), "%G"))
 month = function(x) convertDate(as.IDate(x), "month")
 quarter = function(x) convertDate(as.IDate(x), "quarter")
 year = function(x) convertDate(as.IDate(x), "year")
-yearmon = function(x) convertDate(as.IDate(x), "yearmon")
-yearqtr = function(x) convertDate(as.IDate(x), "yearqtr")
+yearmon = function(x, format=c("numeric", "character")) {
+  format = match.arg(format)
+  x_as_idate = as.IDate(x)
+  ymon = convertDate(x_as_idate, "yearmon")
+  if (format == "numeric") return(ymon)
+  ans = rep(NA_character_, length(x_as_idate))
+  ok = !is.na(x_as_idate)
+  yr = floor(ymon[ok])
+  mon = round((ymon[ok] - yr) * 12) + 1L
+  ans[ok] = paste0(yr, "M", mon)
+  ans
+}
+yearqtr = function(x, format=c("numeric", "character")) {
+  format = match.arg(format)
+  x_as_idate = as.IDate(x)
+  yqtr = convertDate(x_as_idate, "yearqtr")
+  if (format == "numeric") return(yqtr)
+  ans = rep(NA_character_, length(x_as_idate))
+  ok = !is.na(x_as_idate)
+  yr = floor(yqtr[ok])
+  qtr = round((yqtr[ok] - yr) * 4) + 1L
+  ans[ok] = paste0(yr, "Q", qtr)
+  ans
+}
 
 convertDate = function(x, type) {
   type = match.arg(type, c("yday", "wday", "mday", "week", "month", "quarter", "year", "yearmon", "yearqtr"))
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index 443487c6a..c1dc6561b 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -21585,3 +21585,18 @@ close(con)
 file.create(f <- tempfile())
 test(2367.6, fread(file(f)), data.table(), warning="Connection has size 0.")
 unlink(f)
+
+# yearqtr() can optionally output the 2025Q4 format, #7694
+x = c("1111-11-11", "2019-01-01", "2019-02-28", "2019-03-01", "2019-12-31", "2020-02-29", "2020-03-01", "2020-12-31", "2040-01-01", "2040-12-31", "2100-03-01", NA)
+test(2368.1, yearqtr(x, format="numeric"), c(1111.75, 2019, 2019, 2019, 2019.75, 2020, 2020, 2020.75, 2040, 2040.75, 2100, NA))
+test(2368.2, yearqtr(x, format="numeric"), yearqtr(x)) # numeric is the default, preserves backwards compatibility
+test(2368.3, yearqtr(x, format="character"), c("1111Q4", "2019Q1", "2019Q1", "2019Q1", "2019Q4", "2020Q1", "2020Q1", "2020Q4", "2040Q1", "2040Q4", "2100Q1", NA_character_))
+test(2368.4, yearqtr("2016-08-03 01:02:03.45", format="character"), "2016Q3")
+test(2368.5, yearqtr(NA, format="character"), NA_character_)
+
+# yearmon() can optionally output the 2025M4 format, #7694
+test(2369.1, yearmon(x, format="numeric"), c(1111+10/12, 2019, 2019+1/12, 2019+2/12, 2019+11/12, 2020+1/12, 2020+2/12, 2020+11/12, 2040, 2040+11/12, 2100+2/12, NA))
+test(2369.2, yearmon(x, format="numeric"), yearmon(x)) # numeric is the default, preserves backwards compatibility
+test(2369.3, yearmon(x, format="character"), c("1111M11", "2019M1", "2019M2", "2019M3", "2019M12", "2020M2", "2020M3", "2020M12", "2040M1", "2040M12", "2100M3", NA_character_))
+test(2369.4, yearmon("2016-08-03 01:02:03.45", format="character"), "2016M8")
+test(2369.5, yearmon(NA, format="character"), NA_character_)
diff --git a/man/IDateTime.Rd b/man/IDateTime.Rd
index cf762337e..a66ccd888 100644
--- a/man/IDateTime.Rd
+++ b/man/IDateTime.Rd
@@ -97,9 +97,8 @@ isoyear(x)
 month(x)
 quarter(x)
 year(x)
-yearmon(x)
-yearqtr(x)
-
+yearmon(x, format = c("numeric", "character"))
+yearqtr(x, format = c("numeric", "character"))
 }
 
 \arguments{
@@ -115,6 +114,7 @@ yearqtr(x)
   the S3 generic.}
   \item{units}{one of the units listed for truncating. May be abbreviated.}
   \item{ms}{ For \code{as.ITime} methods, what should be done with sub-second fractions of input? Valid values are \code{'truncate'} (floor), \code{'nearest'} (round), and \code{'ceil'} (ceiling). See Details. }
+  \item{format}{Either \code{"numeric"} (the default) or \code{"character"}. \code{"character"} formats the result like \code{"2025M4"} for \code{yearmon} and like \code{"2025Q2"} for \code{yearqtr}.}
 }
 \details{
 \code{IDate} is a date class derived from \code{Date}. It has the same
@@ -209,7 +209,11 @@ Similarly, \code{isoyear()} returns the ISO 8601 year corresponding to the ISO w
   for second, minute, hour, day of year, day of week,
   day of month, week, month, quarter, and year, respectively.
   \code{yearmon} and \code{yearqtr} return double values representing
-  respectively \code{year + (month-1) / 12} and \code{year + (quarter-1) / 4}.
+  respectively \code{year + (month-1) / 12} and \code{year + (quarter-1) / 4}
+  when \code{format = "numeric"} (the default). When \code{format = "character"},
+  they return character vectors of the form \code{"YYYYM#"} (e.g. \code{"2025M4"})
+  and \code{"YYYYQ#"} (e.g. \code{"2025Q2"}) respectively, with \code{NA} input
+  returned as \code{NA_character_}.
 
   \code{second}, \code{minute}, \code{hour} are taken directly from
   the \code{POSIXlt} representation.
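
Review note: the character branch recovers the month or quarter index from the fractional part of the numeric value, and `round()` is what keeps fractions such as 11/12, which are not exact in binary floating point, from truncating to the wrong index. The sketch below reproduces that arithmetic in base R only, so it can be run without the patched package. `label_period` is a hypothetical helper written for illustration, not part of data.table, and it uses `as.Date()` where the patch uses `as.IDate()`.

```r
# Hypothetical base-R helper mirroring the arithmetic of the patched
# yearmon()/yearqtr(); NOT part of data.table.
label_period = function(x, unit = c("month", "quarter")) {
  unit = match.arg(unit)
  d = as.Date(x)                       # the patch uses as.IDate() here
  lt = as.POSIXlt(d)
  n = if (unit == "month") 12L else 4L # periods per year
  # 0-based period index within the year
  idx = if (unit == "month") lt$mon else lt$mon %/% 3L
  num = (lt$year + 1900L) + idx / n    # numeric form: year + (period-1)/n
  ans = rep(NA_character_, length(d))
  ok = !is.na(d)
  yr = floor(num[ok])
  # round() guards against inexact fractions before converting to 1-based
  k = round((num[ok] - yr) * n) + 1L
  ans[ok] = paste0(yr, if (unit == "month") "M" else "Q", k)
  ans
}

print(label_period(c("2025-04-15", NA), "month"))
print(label_period("2019-12-31", "quarter"))
```

As in the patch, missing inputs stay `NA_character_` rather than becoming the string `"NAMNA"`, because only the non-missing positions are pasted.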