How does Go know time.Now?
I thought about this just before I went to sleep the other day, and the answer was more interesting than I’d imagined!
This post may be a little longer than usual, so grab your coffees, grab your teas and without further ado, let’s dive in and see what we can come up with.
All code snippets point to a stable reference; in this case to release-branch.go1.16.
About time.Time
First off, it’s useful to understand just how time is embodied in Go.
The time.Time
struct can represent instants in time with nanosecond-precision. In order to more reliably measure elapsed time for comparisons, additions and subtractions, time.Time
may also contain an optional, nanosecond-precision reading of the current process’ monotonic clock. This is to avoid reporting erroneous durations, eg. when the wall clock is adjusted by clock synchronization.
type Time struct {
wall uint64
ext int64
loc *Location
}
The Time struct took this form back in early 2017; you can browse the relevant issue, proposal and implementation by the man himself, Russ Cox.
So, first off there’s the wall value, providing a straightforward ‘wall clock’ reading, and ext, which provides this extended information in the form of the monotonic clock.
Breaking down wall
it contains a 1-bit hasMonotonic
flag in the highest bit; then 33 bits for keeping track of seconds; and finally 30 bits for keeping track of nanoseconds, in the [0, 999999999] range.
mSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
^ ^ ^
hasMonotonic seconds nanoseconds
The hasMonotonic
flag is always on for values returned by time.Now() in Go >= 1.9 and for dates between 1885 and 2157, but due to the compatibility promise as well as extreme cases, Go makes sure that time values outside this range are handled correctly as well.
More precisely, here’s how the behavior differs:
If the hasMonotonic
bit is 1, then the 33-bit field stores the unsigned wall seconds since Jan 1 Year 1885, and ext holds a signed 64-bit monotonic clock reading, nanoseconds since process start. This is what usually happens in most of your code.
If the hasMonotonic
bit is 0, then the 33-bit field is zero, and the full signed 64-bit wall seconds since Jan 1 year 1 is stored in ext
as it did before the monotonic change.
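To make the layout concrete, here is a minimal sketch that packs and unpacks a wall value by hand. The constants are copied from the time package, but packWall and unpackWall are hypothetical helper names of my own, and the sketch assumes the caller passes a seconds value that fits in 33 bits.

```go
package main

import "fmt"

const (
	hasMonotonic = 1 << 63 // highest bit of wall
	nsecShift    = 30      // nanoseconds occupy the low 30 bits
	nsecMask     = 1<<nsecShift - 1
)

// packWall builds a wall value for the hasMonotonic=1 case.
// Assumption: sec (seconds since 1885) is below 1<<33.
func packWall(sec uint64, nsec uint32) uint64 {
	return hasMonotonic | sec<<nsecShift | uint64(nsec)
}

// unpackWall splits a wall value back into its three fields.
func unpackWall(wall uint64) (mono bool, sec uint64, nsec uint32) {
	mono = wall&hasMonotonic != 0
	// Shift left by one to drop the flag, then right to drop the nanoseconds.
	sec = wall << 1 >> (nsecShift + 1)
	nsec = uint32(wall & nsecMask)
	return mono, sec, nsec
}

func main() {
	wall := packWall(4_000_000_000, 999_999_999)
	fmt.Printf("%064b\n", wall)
	fmt.Println(unpackWall(wall))
}
```

The shift trick in unpackWall is the same idea the time package uses to read the seconds field without a separate mask.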
Finally, each Time
value contains a Location, which is used when computing its presentation form; changing the location only changes the representation eg. when printing the value, it doesn’t affect the instant in time being stored. A nil location (the default) means ‘UTC’.
To reiterate: usual time-telling operations use the wall clock reading, but time-measuring operations, specifically comparisons and subtractions, use the monotonic clock reading.
Great, but how is the current time calculated?
Here’s how time.Now()
and startNano
are defined in Go code.
// Monotonic times are reported as offsets from startNano.
var startNano int64 = runtimeNano() - 1
// Now returns the current local time.
func Now() Time {
sec, nsec, mono := now()
mono -= startNano
sec += unixToInternal - minWall
if uint64(sec)>>33 != 0 {
return Time{uint64(nsec), sec + minWall, Local}
}
return Time{hasMonotonic | uint64(sec)<<nsecShift | uint64(nsec), mono, Local}
}
The code is pretty straightforward if we peek at some constants
hasMonotonic = 1 << 63
unixToInternal int64 = (1969*365 + 1969/4 - 1969/100 + 1969/400) * secondsPerDay
wallToInternal int64 = (1884*365 + 1884/4 - 1884/100 + 1884/400) * secondsPerDay
minWall = wallToInternal // year 1885
nsecShift = 30
The if-branch checks whether we can fit the seconds value in the 33-bit field or whether we need to go with hasMonotonic=off. As the monotonic draft mentions, 2^33 seconds are about 272 years, so effectively we check whether we’re before 1885 or after year (1885+272=) 2157 to return early.
Otherwise, we end up with the usual hasMonotonic=on
case as described above.
Phew that was a lot!
I have to agree! But even with this information, there are two mysteries remaining;
- Where are the unexported now() and runtimeNano() defined?
- Where does Local come from?
Here’s where it gets interesting!
Mystery No.1
Let’s start with the first question. Conventional logic would say that we’d look into the same package, but we’ll probably find nothing there!
These two functions are linkname’d from the runtime package.
// Provided by package runtime.
func now() (sec int64, nsec int32, mono int64)
// runtimeNano returns the current value of the runtime clock in nanoseconds.
//go:linkname runtimeNano runtime.nanotime
func runtimeNano() int64
As the linkname directive informs us, to find runtimeNano() we’d have to search for runtime.nanotime(), where we find two occurrences.
Similarly, if we continue looking in the runtime package, we’ll come across timestub.go, which contains the linknamed definition for time.now() that uses walltime().
// Declarations for operating systems implementing time.now
// indirectly, in terms of walltime and nanotime assembly.
// +build !windows
...
//go:linkname time_now time.now
func time_now() (sec int64, nsec int32, mono int64) {
sec, nsec = walltime()
return sec, nsec, nanotime()
}
A ha! Now we’re getting somewhere!
Both walltime() and nanotime() feature a ‘fake’ implementation meant to be used for the Go playground, as well as the ‘real’ one, which calls walltime1 and nanotime1.
//go:nosplit
func nanotime() int64 {
return nanotime1()
}
func walltime() (sec int64, nsec int32) {
return walltime1()
}
In turn, both nanotime1 and walltime1 are defined for several different platforms and architectures.
Diving deeper
I apologize in advance for any erroneous statement; I’m sometimes like a deer caught in the headlights when confronted with assembly, but let’s try to understand how walltime is calculated for amd64 Linux here.
Please don’t hesitate to reach out for comments and corrections!
// func walltime1() (sec int64, nsec int32)
// non-zero frame-size means bp is saved and restored
TEXT runtime·walltime1(SB),NOSPLIT,$16-12
// We don't know how much stack space the VDSO code will need,
// so switch to g0.
// In particular, a kernel configured with CONFIG_OPTIMIZE_INLINING=n
// and hardening can use a full page of stack space in gettime_sym
// due to stack probes inserted to avoid stack/heap collisions.
// See issue #20427.
MOVQ SP, R12 // Save old SP; R12 unchanged by C code.
get_tls(CX)
MOVQ g(CX), AX
MOVQ g_m(AX), BX // BX unchanged by C code.
// Set vdsoPC and vdsoSP for SIGPROF traceback.
// Save the old values on stack and restore them on exit,
// so this function is reentrant.
MOVQ m_vdsoPC(BX), CX
MOVQ m_vdsoSP(BX), DX
MOVQ CX, 0(SP)
MOVQ DX, 8(SP)
LEAQ sec+0(FP), DX
MOVQ -8(DX), CX
MOVQ CX, m_vdsoPC(BX)
MOVQ DX, m_vdsoSP(BX)
CMPQ AX, m_curg(BX) // Only switch if on curg.
JNE noswitch
MOVQ m_g0(BX), DX
MOVQ (g_sched+gobuf_sp)(DX), SP // Set SP to g0 stack
noswitch:
SUBQ $16, SP // Space for results
ANDQ $~15, SP // Align for C code
MOVL $0, DI // CLOCK_REALTIME
LEAQ 0(SP), SI
MOVQ runtime·vdsoClockgettimeSym(SB), AX
CMPQ AX, $0
JEQ fallback
CALL AX
ret:
MOVQ 0(SP), AX // sec
MOVQ 8(SP), DX // nsec
MOVQ R12, SP // Restore real SP
// Restore vdsoPC, vdsoSP
// We don't worry about being signaled between the two stores.
// If we are not in a signal handler, we'll restore vdsoSP to 0,
// and no one will care about vdsoPC. If we are in a signal handler,
// we cannot receive another signal.
MOVQ 8(SP), CX
MOVQ CX, m_vdsoSP(BX)
MOVQ 0(SP), CX
MOVQ CX, m_vdsoPC(BX)
MOVQ AX, sec+0(FP)
MOVL DX, nsec+8(FP)
RET
fallback:
MOVQ $SYS_clock_gettime, AX
SYSCALL
JMP ret
As far as I can understand, here’s how the process goes.
- Since we don’t know how much stack space the vDSO code will need, we switch over to g0, which is the first goroutine created for each OS thread and is responsible for scheduling other goroutines. We get hold of the thread local storage using get_tls, which loads it into the CX register, and then of our current goroutine using a couple of MOVQ statements.
- The code then saves the values of vdsoPC and vdsoSP (Program Counter and Stack Pointer) on the stack, in order to restore them before exiting so that the function can be re-entrant.
- The code checks whether it is running on the user goroutine (curg). If it is not, for example because it is already on g0, it jumps to noswitch; otherwise it switches to the g0 stack with the following lines
MOVQ m_g0(BX), DX
MOVQ (g_sched+gobuf_sp)(DX), SP // Set SP to g0 stack
- Next up, it tries to load the address of runtime·vdsoClockgettimeSym into the AX register; if it is not zero it calls it and moves on to the ret block, where it retrieves the second and nanosecond values, restores the real Stack Pointer, restores the vDSO program counter and stack pointer and finally returns
MOVQ 0(SP), AX // sec
MOVQ 8(SP), DX // nsec
MOVQ R12, SP // Restore real SP
// Restore vdsoPC, vdsoSP
// We don't worry about being signaled between the two stores.
// If we are not in a signal handler, we'll restore vdsoSP to 0,
// and no one will care about vdsoPC. If we are in a signal handler,
// we cannot receive another signal.
MOVQ 8(SP), CX
MOVQ CX, m_vdsoSP(BX)
MOVQ 0(SP), CX
MOVQ CX, m_vdsoPC(BX)
MOVQ AX, sec+0(FP)
MOVL DX, nsec+8(FP)
RET
- On the other hand, if the address of runtime·vdsoClockgettimeSym is zero, then it jumps to the fallback label, where it uses a different method to get the system’s time, that is $SYS_clock_gettime
MOVQ runtime·vdsoClockgettimeSym(SB), AX
CMPQ AX, $0
JEQ fallback
...
...
fallback:
MOVQ $SYS_clock_gettime, AX
SYSCALL
JMP ret
The same file defines $SYS_clock_gettime
#define SYS_clock_gettime 228
which actually corresponds to the __x64_sys_clock_gettime
syscall when looking up the syscall table from the Linux source code!
What’s with these two different options?
The ‘preferred’ vdsoClockgettimeSym
mode is defined in vdsoSymbolKeys
var vdsoSymbolKeys = []vdsoSymbolKey{
{"__vdso_gettimeofday", 0x315ca59, 0xb01bca00, &vdsoGettimeofdaySym},
{"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym},
}
which matches the exported vDSO symbol found in the documentation.
Why is __vdso_clock_gettime
preferred over __x64_sys_clock_gettime
, and what’s the difference between them?
vDSO stands for virtual dynamic shared object and is a kernel mechanism for exporting a subset of the kernel space routines to user space applications so that these kernel space routines can be called in-process without incurring the performance penalty of switching from user mode to kernel mode.
The vDSO documentation contains the relevant example of gettimeofday
for explaining its benefits.
To quote the docs
There are some system calls the kernel provides that user-space code ends up using frequently, to the point that such calls can dominate overall performance. This is due both to the frequency of the call as well as the context-switch overhead that results from exiting user space and entering the kernel.
Making system calls can be slow, but triggering a software interrupt to tell the kernel you wish to make a system call is expensive as it goes through the full interrupt-handling paths in the processor’s microcode as well as in the kernel.
One frequently used system call is gettimeofday(2). This system call is called directly by user-space applications. This information is also not secret—any application in any privilege mode (root or any unprivileged user) will get the same answer. Thus the kernel arranges for the information required to answer this question to be placed in memory the process can access. Now a call to gettimeofday(2) changes from a system call to a normal function call and a few memory accesses.
So, the vDSO call is preferred as the method of getting clock information as it doesn’t have to go through the kernel’s interrupt-handling path, but can be called more quickly.
To wrap things up, the current time in Linux AMD64 is ultimately derived from either __vdso_clock_gettime
or the __x64_sys_clock_gettime
syscall. To ‘fool’ time.Now()
you’d have to tamper with either of these.
Windows Weirdness
An observant reader may ask: timestub.go carries a // +build !windows constraint. What’s up with that?
Well, Windows implements time.Now()
directly in assembly and the result is linknamed from the timeasm.go
file.
We can see the relevant assembly code in sys_windows_amd64.s
.
As far as I understand, the code path here is somewhat similar to the Linux case. The first thing that the time·now
assembly does is check whether it can use QPC to obtain the time using the nowQPC
function.
CMPB runtime·useQPCTime(SB), $0
JNE useQPC
useQPC:
JMP runtime·nowQPC(SB)
RET
If that’s not the case the code will try to use the following two addresses from the KUSER_SHARED_DATA
structure, also known as SharedUserData
. This structure holds some kernel information that is shared with user-mode, in order to avoid multiple transitions to the kernel, similar to what vDSO does.
#define _INTERRUPT_TIME 0x7ffe0008
#define _SYSTEM_TIME 0x7ffe0014
KSYSTEM_TIME InterruptTime;
KSYSTEM_TIME SystemTime;
The part which uses these two addresses is presented below. The information is fetched as KSYSTEM_TIME
structs.
CMPB runtime·useQPCTime(SB), $0
JNE useQPC
MOVQ $_INTERRUPT_TIME, DI
loop:
MOVL time_hi1(DI), AX
MOVL time_lo(DI), BX
MOVL time_hi2(DI), CX
CMPL AX, CX
JNE loop
SHLQ $32, AX
ORQ BX, AX
IMULQ $100, AX
MOVQ AX, mono+16(FP)
MOVQ $_SYSTEM_TIME, DI
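The values stored at _INTERRUPT_TIME and _SYSTEM_TIME are expressed in units of 100 nanoseconds, which is why the code above multiplies by 100 to get nanoseconds, so their resolution is limited. As far as I can tell from the runtime source, the QPC path is a fallback that is only enabled when running under Wine, rather than the preferred option.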
It’s been so long since I’ve worked with Windows, but here’s some more resources if you’re interested.
Mystery No.2
What was that again? Oh, we haven’t figured out where Local comes from!
The exported Local *Location
symbol points to the localLoc
address at first.
var Local *Location = &localLoc
If this address is nil, as we mentioned, the UTC location is returned. Otherwise, the code attempts to set up the package-level localLoc
variable by using the sync.Once
primitive the first time that location information is needed.
// localLoc is separate so that initLocal can initialize
// it even if a client has changed Local.
var localLoc Location
var localOnce sync.Once
func (l *Location) get() *Location {
if l == nil {
return &utcLoc
}
if l == &localLoc {
localOnce.Do(initLocal)
}
return l
}
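The lazy initialization pattern used by get() can be sketched in isolation. Everything below is a simplified stand-in: zone, localZone, initLocalZone and initCalls are hypothetical names, and the real Location type holds far more than a name.

```go
package main

import (
	"fmt"
	"sync"
)

// zone is a stripped-down stand-in for time.Location.
type zone struct{ name string }

var (
	localZone zone
	localOnce sync.Once
	initCalls int // counts how often the initializer actually runs
)

// initLocalZone plays the role of initLocal: expensive work done once.
func initLocalZone() {
	initCalls++
	localZone.name = "Local"
}

// get mirrors the shape of (*Location).get: nil means UTC, and the
// package-level local zone is initialized on first use only.
func (z *zone) get() *zone {
	if z == nil {
		return &zone{name: "UTC"}
	}
	if z == &localZone {
		localOnce.Do(initLocalZone)
	}
	return z
}

func main() {
	var unset *zone
	fmt.Println(unset.get().name)

	local := &localZone
	fmt.Println(local.get().name, local.get().name)
	fmt.Println("initializer runs:", initCalls)
}
```

Because sync.Once guards the initializer, programs that never format or inspect a local time never pay for loading the timezone data.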
The initLocal()
function looks for the contents of $TZ
to find a time zone to use.
If the $TZ
variable is unset, Go uses a system default file such as /etc/localtime
to load the timezone. If it is set, but empty, it uses UTC, while if it contains any other value, it tries to find a file with that name in the system timezone directories, falling back to UTC if none can be loaded. The default sources that will be searched are
var zoneSources = []string{
"/usr/share/zoneinfo/",
"/usr/share/lib/zoneinfo/",
"/usr/lib/locale/TZ/",
runtime.GOROOT() + "/lib/time/zoneinfo.zip",
}
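The decision made by initLocal() on Unix-like systems can be summarized as a small pure function. This is my own condensed sketch of the logic described above, not the runtime's code: tzSource is a hypothetical name, it only reports which source would be consulted, and it leaves out the fallback to UTC when loading fails.

```go
package main

import (
	"fmt"
	"os"
)

// tzSource reports where the local timezone would be looked up,
// given the value of $TZ and whether the variable is set at all.
func tzSource(tz string, set bool) string {
	switch {
	case !set:
		// No $TZ: use the system default file.
		return "/etc/localtime"
	case tz != "" && tz != "UTC":
		// $TZ=foo: look for "foo" in the zoneinfo sources.
		return "zoneinfo:" + tz
	}
	// $TZ is empty or explicitly UTC.
	return "UTC"
}

func main() {
	tz, set := os.LookupEnv("TZ")
	fmt.Println(tzSource(tz, set))
}
```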
There are platform-specific zoneinfo_XYZ.go
files to find the default timezone using a similar logic eg. for Windows or WASM. In the past, when I wanted to use timezones in a stripped-down container image all I had to do was to add the following line to the Dockerfile when building from a Unix-like system.
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
On the other hand, in cases where we cannot control the build environment, there’s the tzdata
package which provides an embedded copy of the timezone database. If this package is imported anywhere or we build with the -tags timetzdata
flag, the program size will increase by about 450KB, but the program will also have a fallback in cases where Go cannot find a tzdata file on the host system.
Finally, we can set up the location manually from our code by using the LoadLocation
function, eg. for testing purposes.
Outro
That’s all for today! I hope y’all either learned something new, or had some fun and you’re now more confident to jump into the Go codebase!
Feel free to reach out for comments, corrections or advice via email or on Twitter.
See you soon, take care of yourself!
Bonus : What’s with funcname1
in Go?
Throughout the Go codebase, you’ll see many references to funcname1()
or funcname2()
, especially as you’re getting to lower-level code.
As far as I understand they serve two purposes; they help keep up with Go’s Compatibility Promise by more easily altering the internals of an unexported function, and also to ‘group’ similar and/or chaining functionality together.
While some may scoff at this, I think it’s a simple and great idea to keep the code readable and maintainable.