In this chapter, we explore how to handle dates in Python. As a key data type, dates require special attention to ensure accurate manipulation and analysis. We’ll cover how to parse and format dates with the datetime module from the standard library, and how regular expressions can help us recognize dates in text.
Dates are treated as a distinct data type in Python, which means Python provides specific tools for handling them correctly. This ensures that date values can be used reliably for operations such as comparisons, sorting, and time-based calculations. On the one hand, dates can be seen as strings and be written in many different ways. For example, we can write the 1st of January 2024 as "1/1/2024", "01/01/2024" or "2024/01/01". On the other hand, dates follow a specific sequence, which allows them to be treated as quantities for purposes such as sorting and comparing. For instance, we can see the date "02/01/2024" as a later date (and so as a “higher” value) than "01/01/2024".
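A minimal sketch of why this distinction matters, assuming day/month/year strings (the sample values are illustrative, and strptime() is explained in the next paragraph): sorting the raw strings orders them character by character, while sorting the parsed dates orders them chronologically.

```python
from datetime import datetime

# Illustrative dates written as day/month/year strings
raw = ["15/01/2024", "02/03/2024", "10/02/2023"]

# Sorting strings compares characters from left to right,
# so the day decides the order and the year is looked at last
sorted_as_text = sorted(raw)

# Sorting on the parsed values compares actual points in time
sorted_as_dates = sorted(raw, key=lambda s: datetime.strptime(s, "%d/%m/%Y"))

print(sorted_as_text)   # ['02/03/2024', '10/02/2023', '15/01/2024']
print(sorted_as_dates)  # ['10/02/2023', '15/01/2024', '02/03/2024']
```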
In Python, we typically convert strings into date objects using the datetime module from the standard library. One common method for this is strptime(), which takes two main inputs: the date string and a format that describes how the date is written. Since dates can appear in different formats, we must define this structure explicitly so that Python knows how to interpret each part of the string. This is especially important for ambiguous strings such as “01/02/2024”, which can mean different dates depending on whether we use a day-month-year or a month-day-year convention. We specify the format using format codes such as %d for day, %m for month, and %Y for year. For instance, to state that 01/02/2024 means “1st of February 2024”, we use the format %d/%m/%Y, in which d stands for day, m for month and Y for year:
from datetime import datetime
# Date example
datetime.strptime("01/02/2024", "%d/%m/%Y")
datetime.datetime(2024, 2, 1, 0, 0)
# Type of date example
type(datetime.strptime("01/02/2024", "%d/%m/%Y"))
<class 'datetime.datetime'>
As shown, the output is no longer a string but a datetime object, meaning Python now treats it as a real date rather than plain text. The two trailing zeros are the hour and the minute, which default to midnight because the string contains no time. Having a real date matters because it allows correct chronological operations such as sorting and comparison.
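To make the ambiguity concrete, the sketch below parses the same string under both conventions and then shows what happens when the format does not match at all. The variable names are illustrative only.

```python
from datetime import datetime

text = "01/02/2024"

# Day-month-year reading: 1st of February 2024
as_dmy = datetime.strptime(text, "%d/%m/%Y")

# Month-day-year reading: 2nd of January 2024
as_mdy = datetime.strptime(text, "%m/%d/%Y")

print(as_dmy)  # 2024-02-01 00:00:00
print(as_mdy)  # 2024-01-02 00:00:00

# A format that does not match the string raises ValueError
try:
    datetime.strptime(text, "%Y-%m-%d")
except ValueError as err:
    print("Could not parse:", err)
```

Both calls succeed without complaint, so Python cannot warn us when we pick the wrong convention. The format has to come from what we know about the data.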
Because numeric dates are ambiguous, an international standard for writing them, ISO 8601, has been developed. ISO 8601 follows the same ordering as the output above: the components appear in order of decreasing unit, starting with the year, then the month, and then the day. In addition, each component has a fixed number of digits: four for the year and two each for the month and the day (with a leading zero if necessary), so the 1st of February 2024 is written as 2024-02-01.
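As a short illustration, a datetime object can produce its ISO 8601 text with isoformat(), and an ISO 8601 date string can be read back with fromisoformat(). The variable names below are illustrative.

```python
from datetime import datetime

parsed = datetime.strptime("01/02/2024", "%d/%m/%Y")

# ISO 8601 text for the calendar date only
iso_date = parsed.date().isoformat()

# ISO 8601 text including the time part
iso_full = parsed.isoformat()

# Reading an ISO 8601 date string back into a datetime object
back = datetime.fromisoformat("2024-02-01")

print(iso_date)        # 2024-02-01
print(iso_full)        # 2024-02-01T00:00:00
print(back == parsed)  # True
```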
Now that we’ve explored basic date parsing with the datetime module, we can look at how patterns relate to dates. Two kinds of pattern are involved, and they are easy to confuse. A regular expression describes what a date looks like as text (for example, two digits, a slash, two digits, a slash and four digits) and is used to find or extract date-like substrings such as “01/01/2024” or “2024-01-01” from longer strings. A format string, as passed to strptime(), states which component (day, month or year) each part of the string represents, so that the string can be converted into a datetime object. Because dates can be written in so many ways, there are correspondingly many regular expressions and format strings. Regular expressions are particularly useful with messy or inconsistent date data: by identifying common patterns, such as numbers separated by slashes or dashes, we can filter the strings or bring them into a standard layout before parsing them for comparisons or calculations.
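The sketch below shows one way the two kinds of pattern can work together: a regular expression finds the date-like substrings, and strptime() then interprets them. The sample sentence and the two patterns are assumptions made for illustration; real data may need stricter or additional patterns.

```python
import re
from datetime import datetime

# Hypothetical free-text note containing dates in two different layouts
note = "Order placed on 01/02/2024 and shipped on 2024-02-05."

# Two digits, slash, two digits, slash, four digits (e.g. 01/02/2024)
slash_pattern = r"\b\d{2}/\d{2}/\d{4}\b"
# Four digits, dash, two digits, dash, two digits (e.g. 2024-02-05)
dash_pattern = r"\b\d{4}-\d{2}-\d{2}\b"

# The regex only locates the text; the format string gives it meaning
slash_dates = [datetime.strptime(m, "%d/%m/%Y")
               for m in re.findall(slash_pattern, note)]
dash_dates = [datetime.strptime(m, "%Y-%m-%d")
              for m in re.findall(dash_pattern, note)]

print(slash_dates)  # [datetime.datetime(2024, 2, 1, 0, 0)]
print(dash_dates)   # [datetime.datetime(2024, 2, 5, 0, 0)]
```

Note that the regular expression cannot tell whether 01/02/2024 is day-month-year or month-day-year. That decision is still made by the format string.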
The table below lists common format codes that we can use in the format argument, along with their description and two examples per code. In the format string, each code is written with a leading %, as in %d:
| Code | Description | Examples |
|---|---|---|
| %d | Numeric day of month | 05, 06 |
| %a | Abbreviation of day of week | Mon, Tue |
| %A | Full name of day of week | Monday, Tuesday |
| %m | Numeric month of year | 05, 06 |
| %b | Abbreviation of month | Jun, Jul |
| %B | Full name of month | June, July |
| %y | Year without century | 23, 24 |
| %Y | Year with century | 2023, 2024 |
In the examples below, we use the appropriate codes to parse different date representations into datetime objects, whose components follow the same year, month, day order as ISO 8601:
from datetime import datetime
# Date: 3rd of March 2023
datetime.strptime("20230303", "%Y%m%d")
datetime.datetime(2023, 3, 3, 0, 0)
# Date: 3rd of August 2022
datetime.strptime("2022-08/03", "%Y-%m/%d")
datetime.datetime(2022, 8, 3, 0, 0)
# Date: 12th of July 2003
datetime.strptime("03 Jul 12", "%y %b %d")
datetime.datetime(2003, 7, 12, 0, 0)
# Date: 23rd of June 2024
datetime.strptime("June 23/2024", "%B %d/%Y")
datetime.datetime(2024, 6, 23, 0, 0)
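When a dataset mixes several layouts, one possible approach is to try a list of candidate formats in turn. The helper parse_date and the list KNOWN_FORMATS below are hypothetical names introduced for this sketch, not part of the datetime module.

```python
from datetime import datetime

# Hypothetical list of layouts we expect in the data.
# Order matters: for an ambiguous string, the first matching format wins.
KNOWN_FORMATS = ["%Y%m%d", "%Y-%m/%d", "%y %b %d", "%B %d/%Y"]

def parse_date(text, formats=KNOWN_FORMATS):
    """Return the first successful parse of text, or raise ValueError."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this layout did not match, try the next one
    raise ValueError(f"No known format matches {text!r}")

samples = ["20230303", "2022-08/03", "03 Jul 12", "June 23/2024"]
parsed = [parse_date(s) for s in samples]

for s, p in zip(samples, parsed):
    print(s, "->", p.date().isoformat())
```

This only works safely when the candidate formats cannot be confused with one another. Layouts such as %d/%m/%Y and %m/%d/%Y should not appear in the same list.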
Once dates are properly represented, we can compare them, with a more recent date counting as a higher value than an older one. The examples below return True or False, depending on whether each statement holds:
# Is 3rd of March 2023 equal to 3rd of April 2023?
datetime.strptime("20230303", "%Y%m%d") == datetime.strptime("20230403", "%Y%m%d")
False
# Is 3rd of August 2022 earlier than 12th of July 2003?
datetime.strptime("2022-08/03", "%Y-%m/%d") < datetime.strptime("03 Jul 12", "%y %b %d")
False
# Is 23rd of June 2024 more recent than or equal to 23rd of June 2024?
datetime.strptime("Jun 23/2024", "%b %d/%Y") >= datetime.strptime("June 23/2024", "%B %d/%Y")
True
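Because datetime objects can be compared, they also work with tools that rely on comparison, such as min(), max() and filtering against a cutoff. The dates and the cutoff below are illustrative values.

```python
from datetime import datetime

# Illustrative dates, parsed from ISO-style strings
dates = [datetime.strptime(s, "%Y-%m-%d")
         for s in ["2022-08-03", "2003-07-12", "2024-06-23"]]

# Oldest and most recent date
earliest = min(dates)
latest = max(dates)

# Keep only the dates on or after a chosen cutoff
cutoff = datetime(2020, 1, 1)
recent = [d for d in dates if d >= cutoff]

print(earliest)     # 2003-07-12 00:00:00
print(latest)       # 2024-06-23 00:00:00
print(len(recent))  # 2
```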
We can also perform arithmetic with dates, such as adding or subtracting days, using the timedelta object. This works because a datetime represents a point in time, while a timedelta represents a duration:
from datetime import timedelta
# Adding 7 days to '1st of June 2025'
datetime.strptime("20250106", "%Y%d%m") + timedelta(days = 7)
datetime.datetime(2025, 6, 8, 0, 0)
# Subtracting 4 days from '1st of June 2025'
datetime.strptime("20250106", "%Y%d%m") - timedelta(days = 4)
datetime.datetime(2025, 5, 28, 0, 0)
# Difference in days between '8th of June 2025' and '28th of May 2025'
datetime.strptime("20250806", "%Y%d%m") - datetime.strptime("20252805", "%Y%d%m")
datetime.timedelta(days=11)
Notice how in the last example the result of subtracting two dates is not just a plain number, but a timedelta object. This object stores the full time difference in a structured form. However, in many practical situations we are only interested in the number of days. In Python, we can easily extract this value as a component using .days:
# Extracting days from the date difference
(datetime.strptime("20250806", "%Y%d%m") - datetime.strptime("20252805", "%Y%d%m")).days
11
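A timedelta offers a few more ways to read or build a duration. The sketch below reuses the two dates from the example above and shows total_seconds(), a split into weeks and leftover days, and the sign of a reversed subtraction. The variable names are illustrative.

```python
from datetime import datetime, timedelta

start = datetime(2025, 5, 28)
end = datetime(2025, 6, 8)

gap = end - start
print(gap.days)             # 11
print(gap.total_seconds())  # 950400.0

# Splitting the days into whole weeks and leftover days
weeks, days = divmod(gap.days, 7)
print(weeks, days)          # 1 4

# Subtracting in the other direction gives a negative duration
print((start - end).days)   # -11

# Durations can also be built from weeks
two_weeks_later = end + timedelta(weeks=2)
print(two_weeks_later)      # 2025-06-22 00:00:00
```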
Up to this point, we have learned how to build a full date from a string. With datetime, we can also extract different parts from a date value, such as the year or the month. To see how we can do this, we create a datetime object:
# Creating date
d = datetime.strptime("2023-05-06", "%Y-%m-%d")
type(d)
<class 'datetime.datetime'>
The object d is a datetime object, and each date component is stored as an attribute that we can read directly from the object. Once a date is properly parsed, we therefore need no additional functions to access its parts. For derived values, we call methods such as weekday() or timetuple(), which compute the result from the stored components so that we do not have to work it out manually. The code below shows how we can extract different date components from d:
# Extracting year from "2023-05-06"
d.year
2023
# Extracting month from "2023-05-06"
d.month
5
# Extracting day from "2023-05-06"
d.day
6
# Extracting week day (from 0 for Monday to 6 for Sunday) from "2023-05-06"
d.weekday()
5
# Extracting day of year (from 1 to 365 or 366) from "2023-05-06"
d.timetuple().tm_yday
126
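The weekday numbering is a common source of confusion, so the sketch below compares weekday(), which counts from 0 for Monday, with isoweekday(), which counts from 1 for Monday, and prints the day name with strftime(). The name shown assumes Python's default (English) locale.

```python
from datetime import datetime

d = datetime.strptime("2023-05-06", "%Y-%m-%d")

# Monday is 0, so Saturday is 5
print(d.weekday())       # 5

# Monday is 1, so Saturday is 6
print(d.isoweekday())    # 6

# Full name of the day, using the format code %A
day_name = d.strftime("%A")
print(day_name)          # Saturday (under the default English locale)
```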
Lastly, we can get the current date and time using the method today() of the datetime class. This is particularly useful when working with dynamic scripts or reports that need to automatically reflect the current day without manual updates. Each time the code is run, it returns the current local date and time according to the clock of our computer.
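A minimal sketch of this is shown below. No output is listed because the result depends on when the code is run. The variable names are illustrative.

```python
from datetime import date, datetime

# Current local date and time
now = datetime.today()

# Current local date only, without a time part
today = date.today()

# Printing the current date in two different layouts
print(today.isoformat())
print(now.strftime("%d/%m/%Y"))
```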