library(tidyverse)
library(gt)
exibble
## # A tibble: 8 × 9
## num char fctr date time datetime currency row group
## <dbl> <chr> <fct> <chr> <chr> <chr> <dbl> <chr> <chr>
## 1 0.111 apricot one 2015-01-15 13:35 2018-01-01… 50.0 row_1 grp_a
## 2 2.22 banana two 2015-02-15 14:40 2018-02-02… 18.0 row_2 grp_a
## 3 33.3 coconut three 2015-03-15 15:45 2018-03-03… 1.39 row_3 grp_a
## 4 444. durian four 2015-04-15 16:50 2018-04-04… 65100 row_4 grp_a
## 5 5550 <NA> five 2015-05-15 17:55 2018-05-05… 1326. row_5 grp_b
## 6 NA fig six 2015-06-15 <NA> 2018-06-06… 13.3 row_6 grp_b
## 7 777000 grapefruit seven <NA> 19:10 2018-07-07… NA row_7 grp_b
## 8 8880000 honeydew eight 2015-08-15 20:20 <NA> 0.44 row_8 grp_b
3 Formatting
{gt} has two families of functions that handle most of the data formatting. You have already seen members of these families, namely fmt_number() and sub_zero(). In this chapter, we’re going to discover some of their siblings.
The functions in these families are structured the same. So, if you can work with one, then you can work with them all. That’s why we’re not going to cover them all with examples here. For a full list of these functions, take a look at the {gt} docs.
3.1 fmt_* functions
First, we need some example data to practice on. Thankfully, {gt} already comes with data sets that use many different data formats. Let me introduce you to {gt}’s example tibble, or exibble for short.
Let’s put this into a {gt} table. We’re going to use one of the pre-defined themes that come with opt_stylize().
exibble |>
  select(-(row:group)) |>
  gt() |>
  opt_stylize(style = 3)
num | char | fctr | date | time | datetime | currency |
---|---|---|---|---|---|---|
Phew! This table won’t win awards any time soon. Let’s clean it up by working our way through the columns one by one.
3.1.1 Numbers
First, we’re getting rid of the scientific notation in the num column. While we’re at it, we’re going to round the numbers to one decimal.
exibble |>
  select(num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(
    columns = 'num',
    decimals = 1
  )
num |
---|
Next, we may want to adjust the separator marks , and . in the output. For example, in German we write one million as 1.000.000 and a quarter as 0,25. Hence, we could change the sep_mark and dec_mark arguments in fmt_number(). But the easier way is to just change the locale to "de" (German).
exibble |>
  select(num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(
    columns = 'num',
    decimals = 1,
    locale = 'de'
  )
num |
---|
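For comparison, here’s a minimal sketch of the manual route mentioned above. It sets sep_mark and dec_mark directly instead of relying on the locale. The object name tbl_manual_marks is made up for this illustration.

```r
library(dplyr)
library(gt)

# Set the thousands separator and the decimal mark by hand
# instead of using locale = 'de'
tbl_manual_marks <- exibble |>
  select(num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(
    columns = 'num',
    decimals = 1,
    sep_mark = '.',
    dec_mark = ','
  )

tbl_manual_marks
```

This should give the same marks as the locale version. The locale is simply less typing.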
Since we also have some very large numbers in the num column, we could add suffixes instead of displaying a lot of zeroes. This means that we transform e.g. 1000 to 1K.
exibble |>
  select(num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(
    columns = 'num',
    decimals = 1,
    suffixing = TRUE
  )
num |
---|
We could also use our own suffixes.
# Thousand - Million - Billion - Trillion
custom_suffixes <- c("k", "mil", "bil", "tril")

exibble |>
  select(num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(
    columns = 'num',
    decimals = 1,
    suffixing = custom_suffixes
  )
num |
---|
3.1.2 Currency
Now, let’s format the currency column. The default currency is USD. That will give you $ signs.
exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_currency(columns = 'currency')
num | currency |
---|---|
Since I mostly use Euros in my real life, let me change the currency argument here. Also, we’re going to set locale to German again.
exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_currency(
    columns = 'currency', currency = 'EUR', locale = 'de'
  )
num | currency |
---|---|
You’d think that this is the correct way to state a price in Germany. But it’s not. Unfortunately, the locale did not catch that we use the Euro symbol at the end of a number. But no worries, we can fix that manually.
Instead of fmt_currency(), we’re going to use fmt_number() and apply the Euro symbol manually via pattern. The fmt_*() functions use {x} as a placeholder for the function’s regular output. That way, we can modify outputs as we see fit. Here are two examples.
exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  )
num | currency |
---|---|
exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = 'EUR {x}'
  )
num | currency |
---|---|
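As an alternative sketch: fmt_currency() also has placement and incl_space arguments that can move the currency symbol to the right of the number. I’m assuming a reasonably recent {gt} release here, so check ?fmt_currency before relying on it. The object name tbl_euro_right is made up for this example.

```r
library(dplyr)
library(gt)

tbl_euro_right <- exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_currency(
    columns = 'currency',
    currency = 'EUR',
    locale = 'de',
    # Put the symbol after the number, separated by a space
    placement = 'right',
    incl_space = TRUE
  )

tbl_euro_right
```

The pattern approach still has its place because it works with every fmt_*() function, not just with currencies.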
We have rounded the num column to one decimal with the first fmt_number() layer. It’s interesting to find out what would happen if we had targeted the currency column in that layer too. Would the next fmt_number() layer round the previously rounded number or the original number? Let’s check.
exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = c('num', 'currency'), decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  )
num | currency |
---|---|
exibble |>
  select(num, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  )
num | currency |
---|---|
As you can see, the output is the same. This means that the fmt_*() functions always use the original data. That’s good to know.
Fun fact: That’s also what’s happening when you rename a column with cols_label(). Internally, the column names always remain the same.
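To see why formatting from the original data matters, here’s a small base R sketch (not {gt} code) of what would happen if formatting layers did stack. It uses 1325.81, the unrounded currency value from row 5 of exibble. The variable names are made up for this illustration.

```r
x <- 1325.81

# Hypothetical stacking: round to one decimal first, then show two decimals
stacked <- formatC(round(x, 1), format = "f", digits = 2)

# Formatting the original value directly with two decimals
from_original <- formatC(x, format = "f", digits = 2)

stacked        # "1325.80"
from_original  # "1325.81"
```

Since both tables above show the same output, the second layer evidently works like from_original and not like stacked.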
3.1.3 Dates, times and datetimes
We can format any date using fmt_date(). And there are quite a few date_style options we can choose from. Here are a few examples.
exibble |>
  select(num, currency, date) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', date_style = "wday_month_day_year")
num | currency | date |
---|---|---|
exibble |>
  select(num, currency, date) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', date_style = "day_m_year")
num | currency | date |
---|---|---|
exibble |>
  select(num, currency, date) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', date_style = "yMMMd")
num | currency | date |
---|---|---|
exibble |>
  select(num, currency, date) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', date_style = "yMMMEd")
num | currency | date |
---|---|---|
To see the full list of available styles, you can run info_date_style().
info_date_style()
[Table output: "Date Formatting Options", usable in the fmt_date() and fmt_datetime() functions. It lists 41 numbered date styles with the columns "Format Name" and "Formatted Date (en)".]
Notice that some of these styles are labeled as flexible. This means that they will adjust to locales. Beware that with the non-flexible styles, month names may adapt to the locale but the formatting itself will not.
Here’s an example of that with day_m_year (not flexible) and yMMMd (flexible) using the German locale. Notice how day_m_year does not set a . after the day but yMMMd does. The latter is the correct German formatting.
exibble |>
  select(num, currency, date) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(
    columns = 'date',
    locale = 'de',
    date_style = "day_m_year"
  )
num | currency | date |
---|---|---|
exibble |>
  select(num, currency, date) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(
    columns = 'date',
    locale = 'de',
    date_style = "yMMMd"
  )
num | currency | date |
---|---|---|
Formatting time works basically the same, so I’m just going to show one example.[1]
exibble |>
  select(num, currency, date, time) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', locale = 'de', date_style = "yMMMd") |>
  fmt_time(columns = 'time', time_style = "Hms")
num | currency | date | time |
---|---|---|---|
I have a date. I have a time. Uh! Datetime, cf. PPAP.[2]
Working with these magical columns is exactly what you’d expect. You use fmt_datetime(), which has a date_style and a time_style argument.
exibble |>
  select(num, currency, date, time, datetime) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', locale = 'de', date_style = "yMMMd") |>
  fmt_time(columns = 'time', time_style = "Hms") |>
  fmt_datetime(
    columns = 'datetime',
    date_style = "yMMMd",
    time_style = "Hms"
  )
num | currency | date | time | datetime |
---|---|---|---|---|
3.1.4 Markdown
We can also use Markdown and therefore HTML + CSS in our tables. Let’s use that to make our table a bit more colorful. For example, we could wrap elements from the currency column in <span> tags to colorize them.
exibble |>
  select(num, currency, date, time, datetime) |>
  mutate(
    currency = str_c(
      '<span style="color:red;font-size:20pt">',
      currency, '€</span>'
    )
  ) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_date(columns = 'date', locale = 'de', date_style = "yMMMd") |>
  fmt_time(columns = 'time', time_style = "Hms") |>
  fmt_datetime(
    columns = 'datetime',
    date_style = "yMMMd",
    time_style = "Hms"
  ) |>
  fmt_markdown(columns = 'currency')
num | currency | date | time | datetime |
---|---|---|---|---|
This is one way you could style your table. But I’ve used this way only for demo purposes. We’ll learn more about styling in Chapter 4.
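The same layer also handles plain Markdown syntax, no HTML needed. Here’s a minimal sketch with a made-up one-column tibble (the names tbl_md and text are just for illustration).

```r
library(dplyr)
library(gt)

# Each cell contains Markdown that fmt_markdown() renders
tbl_md <- tibble(
  text = c('**bold**', '*italic*', '`code`')
) |>
  gt() |>
  fmt_markdown(columns = 'text')

tbl_md
```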
The real power of the fmt_markdown() layer is that you can put any HTML into the table and it will be formatted correctly afterwards. For example, I’ve copied the SVG code (which can be used in HTML) for the R logo from Wikipedia. Putting this code into a {gt} table and using fmt_markdown() lets me use the R logo.
## factor to apply to original width and height of svg from Wikipedia
scale_size <- 0.5

r_logo_svg <- glue::glue('
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" preserveAspectRatio="xMidYMid" width="{724 * scale_size}" height="{561 * scale_size}" viewBox="0 0 724 561">
<defs>
<linearGradient id="gradientFill-1" x1="0" x2="1" y1="0" y2="1" gradientUnits="objectBoundingBox" spreadMethod="pad">
<stop offset="0" stop-color="rgb(203,206,208)" stop-opacity="1"/>
<stop offset="1" stop-color="rgb(132,131,139)" stop-opacity="1"/>
</linearGradient>
<linearGradient id="gradientFill-2" x1="0" x2="1" y1="0" y2="1" gradientUnits="objectBoundingBox" spreadMethod="pad">
<stop offset="0" stop-color="rgb(39,109,195)" stop-opacity="1"/>
<stop offset="1" stop-color="rgb(22,92,170)" stop-opacity="1"/>
</linearGradient>
</defs>
<path d="M361.453,485.937 C162.329,485.937 0.906,377.828 0.906,244.469 C0.906,111.109 162.329,3.000 361.453,3.000 C560.578,3.000 722.000,111.109 722.000,244.469 C722.000,377.828 560.578,485.937 361.453,485.937 ZM416.641,97.406 C265.289,97.406 142.594,171.314 142.594,262.484 C142.594,353.654 265.289,427.562 416.641,427.562 C567.992,427.562 679.687,377.033 679.687,262.484 C679.687,147.971 567.992,97.406 416.641,97.406 Z" fill="url(#gradientFill-1)" fill-rule="evenodd"/>
<path d="M550.000,377.000 C550.000,377.000 571.822,383.585 584.500,390.000 C588.899,392.226 596.510,396.668 602.000,402.500 C607.378,408.212 610.000,414.000 610.000,414.000 L696.000,559.000 L557.000,559.062 L492.000,437.000 C492.000,437.000 478.690,414.131 470.500,407.500 C463.668,401.969 460.755,400.000 454.000,400.000 C449.298,400.000 420.974,400.000 420.974,400.000 L421.000,558.974 L298.000,559.026 L298.000,152.938 L545.000,152.938 C545.000,152.938 657.500,154.967 657.500,262.000 C657.500,369.033 550.000,377.000 550.000,377.000 ZM496.500,241.024 L422.037,240.976 L422.000,310.026 L496.500,310.002 C496.500,310.002 531.000,309.895 531.000,274.877 C531.000,239.155 496.500,241.024 496.500,241.024 Z" fill="url(#gradientFill-2)" fill-rule="evenodd"/>
</svg>
')
tibble(logo = r_logo_svg) |>
gt() |>
fmt_markdown(columns = 'logo')
logo |
---|
This fmt_markdown() technique is super powerful. We could even use it to nest {gt} tables (which are HTML) inside of each other. That’s what we’ll do in Chapter 5 to create elaborate tables.
3.1.5 Any data format
There are some more fmt_*() functions for specific formats. Once again, you can look at them in the docs. Instead of showing them all, let me finish off this section with the most powerful function of them all. That’s fmt().
You can just apply any function that you like for formatting. For example, you could convert text entries to all-caps with str_to_upper().
exibble |>
  select(num, currency, date, time, datetime, char) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', locale = 'de', date_style = "yMMMd") |>
  fmt_time(columns = 'time', time_style = "Hms") |>
  fmt_datetime(
    columns = 'datetime',
    date_style = "yMMMd",
    time_style = "Hms"
  ) |>
  fmt(columns = 'char', fns = str_to_upper)
num | currency | date | time | datetime | char |
---|---|---|---|---|---|
Or you could write your own time-formatting function.
on_time_format <- function(time, target) {
  if_else(parse_time(time) <= target, 'on time', 'too late')
}

exibble |>
  select(time) |>
  mutate(rep_time = time) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt(
    columns = 'rep_time',
    fns = function(x) {
      on_time_format(x, hms::hms(hours = 16, minutes = 30))
    }
  )
time | rep_time |
---|---|
3.2 sub_* functions
The sub_*() functions are straightforward to use. There are five functions that you can use.
sub_missing() replaces NA values
sub_zero() replaces zeroes
sub_large_vals() replaces large values (according to some threshold)
sub_small_vals() does… I think you can guess it
sub_values() can replace specific numbers or texts that match a regex
The first two are straightforward to use. By default, they apply to the whole data. But you can also target only specific columns and rows by changing the columns and rows arguments.
exibble |>
  select(num, currency, date, time, datetime) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = 'num', decimals = 1) |>
  fmt_number(
    columns = 'currency',
    decimals = 2,
    locale = 'de',
    pattern = '{x}€'
  ) |>
  fmt_date(columns = 'date', locale = 'de', date_style = "yMMMd") |>
  fmt_time(columns = 'time', time_style = "Hms") |>
  fmt_datetime(
    columns = 'datetime',
    date_style = "yMMMd",
    time_style = "Hms"
  ) |>
  sub_missing(missing_text = '----------')
num | currency | date | time | datetime |
---|---|---|---|---|
tibble(demo_column = -3:3) |>
gt() |>
opt_stylize(style = 3) |>
sub_zero(zero_text = 'ZERO, WATCH OUT WHOOP WHOOP')
demo_column |
---|
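Targeting specific columns and rows could look like the following sketch. The replacement texts and the object name tbl_targeted are made up for this example. Row 5 is the one where char is missing in exibble.

```r
library(dplyr)
library(gt)

tbl_targeted <- exibble |>
  select(num, char, currency) |>
  gt() |>
  opt_stylize(style = 3) |>
  # Only replace missing values in the num column
  sub_missing(columns = 'num', missing_text = 'no number') |>
  # Only replace the missing value in row 5 of the char column
  sub_missing(columns = 'char', rows = 5, missing_text = 'no fruit')

tbl_targeted
```

The missing value in the currency column is left alone because no layer targets it.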
With sub_small_vals() and sub_large_vals() you have to be a bit careful about the sign of the number you’re replacing. Each call replaces only positive or only negative numbers, depending on its sign argument. So, if you want to replace both positive and negative numbers, you have to use the layers multiple times.
tibble(x = c(-100, 100, 0.01, -0.01), demo_col = x) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = where(is.numeric)) |>
  sub_small_vals(
    columns = 'demo_col', threshold = 1, sign = '+'
  ) |>
  sub_large_vals(
    columns = 'demo_col', threshold = 50, sign = '+'
  )
x | demo_col |
---|---|
tibble(x = c(-100, 100, 0.01, -0.01), demo_col = x) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = where(is.numeric)) |>
  sub_small_vals(
    columns = 'demo_col', threshold = 1, sign = '-'
  ) |>
  sub_large_vals(
    columns = 'demo_col', threshold = 50, sign = '-'
  )
x | demo_col |
---|---|
tibble(x = c(-100, 100, 0.01, -0.01), demo_col = x) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = where(is.numeric)) |>
  sub_small_vals(
    columns = 'demo_col', threshold = 1, sign = '+'
  ) |>
  sub_large_vals(
    columns = 'demo_col', threshold = 50, sign = '+'
  ) |>
  sub_small_vals(
    columns = 'demo_col', threshold = 1, sign = '-'
  ) |>
  sub_large_vals(
    columns = 'demo_col', threshold = 50, sign = '-'
  )
x | demo_col |
---|---|
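If the default replacement texts are not to your liking, both functions should also let you set your own via small_pattern and large_pattern. This is a sketch that assumes these arguments exist in your {gt} version, so check ?sub_small_vals first. The texts 'tiny' and 'huge' and the object name tbl_patterns are made up.

```r
library(dplyr)
library(gt)

tbl_patterns <- tibble(x = c(-100, 100, 0.01, -0.01), demo_col = x) |>
  gt() |>
  fmt_number(columns = everything()) |>
  # Positive values below the threshold get a custom text
  sub_small_vals(
    columns = 'demo_col', threshold = 1,
    small_pattern = 'tiny', sign = '+'
  ) |>
  # Positive values at or above the threshold get a custom text
  sub_large_vals(
    columns = 'demo_col', threshold = 50,
    large_pattern = 'huge', sign = '+'
  )

tbl_patterns
```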
The last sub_*() function is sub_values(). It is the most powerful function of the sub_*() family because it can replace numbers and texts. To do that, it has a values and a pattern argument. In case you’re wondering, you can only use one of them at a time. If you specify both, pattern will always take precedence.
But there’s more. It also has an fn argument. You could use it to let an arbitrary function decide which values get replaced. In order for this to work, this function must take a column and return a TRUE/FALSE vector of the same length.
Let’s take a look at a couple of examples.
exibble |>
  select(num) |>
  mutate(demo_col = num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = everything()) |>
  sub_values(
    columns = 'demo_col',
    values = c(0.111, 777000),
    replacement = 'REPLACED'
  )
num | demo_col |
---|---|
exibble |>
  select(char) |>
  mutate(demo_col = char) |>
  gt() |>
  opt_stylize(style = 3) |>
  sub_values(
    columns = 'demo_col',
    pattern = '(a|e)',
    replacement = 'fruit contains an a or e'
  )
char | demo_col |
---|---|
exibble |>
  select(num) |>
  mutate(demo_col = num) |>
  gt() |>
  opt_stylize(style = 3) |>
  fmt_number(columns = everything()) |>
  sub_values(
    columns = 'demo_col',
    fn = function(x) between(x, 10, 10000),
    replacement = 'Between 10 and 10000'
  )
num | demo_col |
---|---|
3.3 Summary
That’s a wrap on Chapter 3. We’ve got the formatting options covered. Time to get to the most complicated part of our tables: Their theme.
Just like in a ggplot, we can style more or less every part of our table. And if you’re familiar with HTML/CSS, you can even apply custom styles that have not been implemented in {gt} yet.
1. You should probably know that there seems to be an issue with some of the time formats when you’re also using {renv}. At least I’ve run into some troubles with that (see Issue). But if your desired format does not work, you can always format it manually, either before sending the data to gt() or with fmt(), which we’ll cover in a sec.
2. My brain randomly reminded me of some dumb internet stuff from 6 years ago. So naturally I had to incorporate it into my text somehow. And of course the YouTube algorithm had to remind me of more fun stuff.