How does Go know time.Now?

I thought about this just before I went to sleep the other day, and the answer was more interesting than I’d imagined!

This post may be a little longer than usual, so grab your coffees, grab your teas and without further ado, let’s dive in and see what we can come up with.

All code snippets point to a stable reference; in this case to release-branch.go1.16.

About time.Time

First off, it’s useful to understand just how time is embodied in Go.

The time.Time struct can represent instants in time with nanosecond precision. In order to more reliably measure elapsed time for comparisons, additions and subtractions, time.Time may also contain an optional, nanosecond-precision reading of the current process's monotonic clock. This avoids reporting erroneous durations when the wall clock is adjusted, e.g. by clock synchronization.

type Time struct {
	wall uint64
	ext  int64
	loc *Location
}

The Time struct took this form back in early 2017; you can browse the relevant issue, proposal and implementation by the man himself, Russ Cox.

So, first there’s the wall value, which provides a straightforward ‘wall clock’ reading, and ext, which provides the extended information in the form of the monotonic clock.

Breaking down wall: it contains a 1-bit hasMonotonic flag in the highest bit, then 33 bits for keeping track of seconds, and finally 30 bits for keeping track of nanoseconds, in the [0, 999999999] range.

mSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
^                   ^                   ^
hasMonotonic        seconds             nanoseconds

The hasMonotonic flag is always on for values returned by time.Now() in Go >= 1.9, for dates between 1885 and 2157; but due to the compatibility promise, as well as for dates outside that range, Go makes sure that time values without it are handled correctly as well.
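To make the layout a bit more tangible, here is a minimal sketch of packing and unpacking such a 64-bit word. The constants hasMonotonic, nsecShift and nsecMask mirror names used in the time package, while secMask, packWall and unpackWall are hypothetical helpers I made up for illustration; they are not part of the standard library.

```go
package main

import "fmt"

const (
	hasMonotonic = 1 << 63   // highest bit: monotonic reading present
	nsecShift    = 30        // seconds sit above the 30 nanosecond bits
	nsecMask     = 1<<30 - 1 // low 30 bits: nanoseconds
	secMask      = 1<<33 - 1 // hypothetical name: 33 bits of seconds
)

// packWall is a hypothetical helper that builds a wall word from
// seconds since 1885 and nanoseconds, with the hasMonotonic bit set.
func packWall(sec uint64, nsec uint32) uint64 {
	return hasMonotonic | sec<<nsecShift | uint64(nsec)
}

// unpackWall is the hypothetical inverse of packWall.
func unpackWall(wall uint64) (mono bool, sec uint64, nsec uint32) {
	mono = wall&hasMonotonic != 0
	sec = wall >> nsecShift & secMask
	nsec = uint32(wall & nsecMask)
	return mono, sec, nsec
}

func main() {
	w := packWall(4_294_967_296, 999_999_999)
	fmt.Printf("%064b\n", w) // same m/S/n layout as the diagram above
	fmt.Println(unpackWall(w))
}
```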

More precisely, here’s how the behavior differs:

If the hasMonotonic bit is 1, then the 33-bit field stores the unsigned wall seconds since Jan 1 Year 1885, and ext holds a signed 64-bit monotonic clock reading, nanoseconds since process start. This is what usually happens in most of your code.

If the hasMonotonic bit is 0, then the 33-bit field is zero, and the full signed 64-bit wall seconds since Jan 1 year 1 is stored in ext, as it was before the monotonic change.

Finally, each Time value contains a Location, which is used when computing its presentation form; changing the location only changes the representation, e.g. when printing the value; it doesn’t affect the instant in time being stored. A nil location (the default) means ‘UTC’.

To reiterate: usual time-telling operations use the wall clock reading, but time-measuring operations, specifically comparisons and subtractions, use the monotonic clock reading when both values carry one.

Great, but how is the current time calculated?

Here’s how time.Now() and startNano are defined in Go code.

// Monotonic times are reported as offsets from startNano.
var startNano int64 = runtimeNano() - 1

// Now returns the current local time.
func Now() Time {
	sec, nsec, mono := now()
	mono -= startNano
	sec += unixToInternal - minWall
	if uint64(sec)>>33 != 0 {
		return Time{uint64(nsec), sec + minWall, Local}
	}
	return Time{hasMonotonic | uint64(sec)<<nsecShift | uint64(nsec), mono, Local}
}

The code is pretty straightforward if we peek at some constants:

hasMonotonic         = 1 << 63
unixToInternal int64 = (1969*365 + 1969/4 - 1969/100 + 1969/400) * secondsPerDay
wallToInternal int64 = (1884*365 + 1884/4 - 1884/100 + 1884/400) * secondsPerDay
minWall              = wallToInternal               // year 1885
nsecShift            = 30

The if-branch checks whether we can fit the seconds value in the 33-bit field or we need to go with hasMonotonic=off. As the monotonic draft mentions, 2^33 seconds are 272 years, so effectively we check whether we’re outside the range from 1885 to (1885+272=) 2157 and, if so, return early.

Otherwise, we end up with the usual hasMonotonic=on case as described above.

Phew that was a lot!

I have to agree! But even with this information, there are two mysteries remaining:

Where are the unexported now() and runtimeNano() defined? and

Where does Local come from?

Here’s where it gets interesting!

Mystery No.1

Let’s start with the first question. Conventional logic would say that we’d look into the same package, but we’ll probably find nothing there!

These two functions are linkname’d from the runtime package.

// Provided by package runtime.
func now() (sec int64, nsec int32, mono int64)

// runtimeNano returns the current value of the runtime clock in nanoseconds.
//go:linkname runtimeNano runtime.nanotime
func runtimeNano() int64

As the linkname directive informs us, to find runtimeNano() we’d have to search for runtime.nanotime(), where we find two occurrences.

Similarly, if we continue looking in the runtime package, we’ll come across timestub.go which contains the linknamed definition for time.now() that uses walltime().

// Declarations for operating systems implementing time.now
// indirectly, in terms of walltime and nanotime assembly.

// +build !windows
...
//go:linkname time_now time.now
func time_now() (sec int64, nsec int32, mono int64) {
	sec, nsec = walltime()
	return sec, nsec, nanotime()
}

A ha! Now we’re getting somewhere!

Both walltime() and nanotime() feature a ‘fake’ implementation meant to be used for the Go playground, as well as the ‘real’ one, which calls to walltime1 and nanotime1.

//go:nosplit
func nanotime() int64 {
	return nanotime1()
}

func walltime() (sec int64, nsec int32) {
	return walltime1()
}

In turn, both nanotime1 and walltime1 are defined for several different platforms and architectures.

Diving deeper

I apologize in advance for any erroneous statement; I’m sometimes like a deer caught in the headlights when confronted with assembly, but let’s try to understand how walltime is calculated for amd64 Linux here.

Please don’t hesitate to reach out for comments and corrections!

// func walltime1() (sec int64, nsec int32)
// non-zero frame-size means bp is saved and restored
TEXT runtime·walltime1(SB),NOSPLIT,$16-12
	// We don't know how much stack space the VDSO code will need,
	// so switch to g0.
	// In particular, a kernel configured with CONFIG_OPTIMIZE_INLINING=n
	// and hardening can use a full page of stack space in gettime_sym
	// due to stack probes inserted to avoid stack/heap collisions.
	// See issue #20427.

	MOVQ	SP, R12	// Save old SP; R12 unchanged by C code.

	get_tls(CX)
	MOVQ	g(CX), AX
	MOVQ	g_m(AX), BX // BX unchanged by C code.

	// Set vdsoPC and vdsoSP for SIGPROF traceback.
	// Save the old values on stack and restore them on exit,
	// so this function is reentrant.
	MOVQ	m_vdsoPC(BX), CX
	MOVQ	m_vdsoSP(BX), DX
	MOVQ	CX, 0(SP)
	MOVQ	DX, 8(SP)

	LEAQ	sec+0(FP), DX
	MOVQ	-8(DX), CX
	MOVQ	CX, m_vdsoPC(BX)
	MOVQ	DX, m_vdsoSP(BX)

	CMPQ	AX, m_curg(BX)	// Only switch if on curg.
	JNE	noswitch

	MOVQ	m_g0(BX), DX
	MOVQ	(g_sched+gobuf_sp)(DX), SP	// Set SP to g0 stack

noswitch:
	SUBQ	$16, SP		// Space for results
	ANDQ	$~15, SP	// Align for C code

	MOVL	$0, DI // CLOCK_REALTIME
	LEAQ	0(SP), SI
	MOVQ	runtime·vdsoClockgettimeSym(SB), AX
	CMPQ	AX, $0
	JEQ	fallback
	CALL	AX
ret:
	MOVQ	0(SP), AX	// sec
	MOVQ	8(SP), DX	// nsec
	MOVQ	R12, SP		// Restore real SP
	// Restore vdsoPC, vdsoSP
	// We don't worry about being signaled between the two stores.
	// If we are not in a signal handler, we'll restore vdsoSP to 0,
	// and no one will care about vdsoPC. If we are in a signal handler,
	// we cannot receive another signal.
	MOVQ	8(SP), CX
	MOVQ	CX, m_vdsoSP(BX)
	MOVQ	0(SP), CX
	MOVQ	CX, m_vdsoPC(BX)
	MOVQ	AX, sec+0(FP)
	MOVL	DX, nsec+8(FP)
	RET
fallback:
	MOVQ	$SYS_clock_gettime, AX
	SYSCALL
	JMP ret

As far as I can understand, here’s how the process goes.

Since we don’t know how much stack space the vDSO code will need, we switch over to g0, which is the first goroutine created for each OS thread, responsible for scheduling other goroutines. The code loads the thread-local storage into the CX register using get_tls, then uses a couple of MOVQ statements to load the current goroutine (g) into AX and its OS thread (m) into BX.
The code then saves the old values of vdsoPC and vdsoSP (Program Counter and Stack Pointer) on the stack, so that it can restore them before exiting and the function can be re-entrant.
The code checks whether it is running on the thread’s current user goroutine (curg). If it is not, it jumps to noswitch; otherwise it switches to the g0 stack with the following lines
```
MOVQ	m_g0(BX), DX
MOVQ	(g_sched+gobuf_sp)(DX), SP	// Set SP to g0 stack
```

Next up, it loads the address stored in runtime·vdsoClockgettimeSym into the AX register. If it is not zero, it calls it and moves on to the ret block, where it retrieves the second and nanosecond values, restores the real Stack Pointer, restores the vDSO program counter and stack pointer, and finally returns.

 MOVQ	0(SP), AX	// sec
 MOVQ	8(SP), DX	// nsec
 MOVQ	R12, SP		// Restore real SP
 // Restore vdsoPC, vdsoSP
 // We don't worry about being signaled between the two stores.
 // If we are not in a signal handler, we'll restore vdsoSP to 0,
 // and no one will care about vdsoPC. If we are in a signal handler,
 // we cannot receive another signal.
 MOVQ	8(SP), CX
 MOVQ	CX, m_vdsoSP(BX)
 MOVQ	0(SP), CX
 MOVQ	CX, m_vdsoPC(BX)
 MOVQ	AX, sec+0(FP)
 MOVL	DX, nsec+8(FP)
 RET

On the other hand, if the address in runtime·vdsoClockgettimeSym is zero, then it jumps to the fallback label, where it uses a different method to get the system’s time: the SYS_clock_gettime system call.
```
 MOVQ	runtime·vdsoClockgettimeSym(SB), AX
 CMPQ	AX, $0
 JEQ	fallback
...
...
fallback:
 MOVQ	$SYS_clock_gettime, AX
 SYSCALL
 JMP ret
```

The same file defines $SYS_clock_gettime

#define SYS_clock_gettime	228

which actually corresponds to the __x64_sys_clock_gettime syscall when looking up the syscall table from the Linux source code!

What’s with these two different options?

The ‘preferred’ vdsoClockgettimeSym mode is defined in vdsoSymbolKeys

var vdsoSymbolKeys = []vdsoSymbolKey{
	{"__vdso_gettimeofday", 0x315ca59, 0xb01bca00, &vdsoGettimeofdaySym},
	{"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym},
}

which matches the exported vDSO symbol found in the documentation.

Why is __vdso_clock_gettime preferred over __x64_sys_clock_gettime, and what’s the difference between them?

vDSO stands for virtual dynamic shared object and is a kernel mechanism for exporting a subset of the kernel space routines to user space applications so that these kernel space routines can be called in-process without incurring the performance penalty of switching from user mode to kernel mode.

The vDSO documentation contains the relevant example of gettimeofday for explaining its benefits.

To quote the docs

There are some system calls the kernel provides that user-space code ends up using frequently, to the point that such calls can dominate overall performance. This is due both to the frequency of the call as well as the context-switch overhead that results from exiting user space and entering the kernel.

Making system calls can be slow, but triggering a software interrupt to tell the kernel you wish to make a system call is expensive as it goes through the full interrupt-handling paths in the processor’s microcode as well as in the kernel.

One frequently used system call is gettimeofday(2). This system call is called directly by user-space applications. This information is also not secret—any application in any privilege mode (root or any unprivileged user) will get the same answer. Thus the kernel arranges for the information required to answer this question to be placed in memory the process can access. Now a call to gettimeofday(2) changes from a system call to a normal function call and a few memory accesses.

So, the vDSO call is preferred as the method of getting clock information as it doesn’t have to go through the kernel’s interrupt-handling path, but can be called more quickly.

To wrap things up, the current time on amd64 Linux is ultimately derived from either __vdso_clock_gettime or the __x64_sys_clock_gettime syscall. To ‘fool’ time.Now() you’d have to tamper with either of these.

Windows Weirdness

An observant reader may ask: timestub.go uses // +build !windows. What’s up with that?

Well, Windows implements time.now() directly in assembly, and the result is linknamed from the timeasm.go file.

We can see the relevant assembly code in sys_windows_amd64.s.

As far as I understand, the code path here is somewhat similar to the Linux case. The first thing that the time·now assembly does is check the useQPCTime flag to see whether it should obtain the time via QPC (QueryPerformanceCounter), using the nowQPC function.

	CMPB	runtime·useQPCTime(SB), $0
	JNE	useQPC

useQPC:
	JMP	runtime·nowQPC(SB)
	RET

If that’s not the case the code will try to use the following two addresses from the KUSER_SHARED_DATA structure, also known as SharedUserData. This structure holds some kernel information that is shared with user-mode, in order to avoid multiple transitions to the kernel, similar to what vDSO does.

#define _INTERRUPT_TIME 0x7ffe0008
#define _SYSTEM_TIME 0x7ffe0014

KSYSTEM_TIME InterruptTime;
KSYSTEM_TIME SystemTime;

The part which uses these two addresses is presented below. The information is fetched as KSYSTEM_TIME structs.

	CMPB	runtime·useQPCTime(SB), $0
	JNE	useQPC
	MOVQ	$_INTERRUPT_TIME, DI
loop:
	MOVL	time_hi1(DI), AX
	MOVL	time_lo(DI), BX
	MOVL	time_hi2(DI), CX
	CMPL	AX, CX
	JNE	loop
	SHLQ	$32, AX
	ORQ	BX, AX
	IMULQ	$100, AX
	MOVQ	AX, mono+16(FP)

	MOVQ	$_SYSTEM_TIME, DI

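The values at these addresses are counted in units of 100 nanoseconds, which is why the code multiplies by 100 to get nanoseconds. As far as I can tell from the comments in os_windows.go, useQPCTime is only set when running under Wine, so the QPC path looks like a workaround for that environment rather than the preferred option.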
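The retry loop in the assembly above (read the high part, the low part, then the high part again, and start over if the two high reads differ) can be sketched in Go as follows. This is only an illustration under my own assumptions: ksystemTime and readNanos are hypothetical names, the struct merely mimics the three 32-bit fields that the time_lo, time_hi1 and time_hi2 offsets refer to, and the real runtime reads the shared page directly in assembly.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ksystemTime is a hypothetical Go mirror of the three 32-bit fields
// read by the assembly: low part, high part, and a second copy of the
// high part that the writer updates separately.
type ksystemTime struct {
	lo  uint32
	hi1 int32
	hi2 int32
}

// readNanos retries until both copies of the high part agree, which
// means no update happened in the middle of the read. The value is in
// 100ns units, so it is multiplied by 100 to get nanoseconds.
func readNanos(t *ksystemTime) int64 {
	for {
		hi1 := atomic.LoadInt32(&t.hi1)
		lo := atomic.LoadUint32(&t.lo)
		hi2 := atomic.LoadInt32(&t.hi2)
		if hi1 == hi2 {
			return (int64(hi1)<<32 | int64(lo)) * 100
		}
	}
}

func main() {
	t := &ksystemTime{lo: 5, hi1: 1, hi2: 1}
	fmt.Println(readNanos(t)) // 429496730100
}
```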

It’s been so long since I’ve worked with Windows, but here are some more resources if you’re interested.

Mystery No.2

What was that again? Oh, we haven’t figured out where Local comes from!

The exported Local *Location symbol points to the localLoc address at first.

var Local *Location = &localLoc

If a Location pointer is nil, as we mentioned, the UTC location is returned. Otherwise, if it points to localLoc, the code sets up the package-level localLoc variable by using the sync.Once primitive the first time that location information is needed.

// localLoc is separate so that initLocal can initialize
// it even if a client has changed Local.
var localLoc Location
var localOnce sync.Once

func (l *Location) get() *Location {
	if l == nil {
		return &utcLoc
	}
	if l == &localLoc {
		localOnce.Do(initLocal)
	}
	return l
}

The initLocal() function looks for the contents of $TZ to find a time zone to use.

If the $TZ variable is unset, Go uses a system default file such as /etc/localtime to load the timezone. If it is set but empty, it uses UTC. If it is set to any other value, Go treats it as a timezone name and tries to find a file with the same name in the system timezone directories, falling back to UTC if that fails. The default sources that will be searched are

var zoneSources = []string{
	"/usr/share/zoneinfo/",
	"/usr/share/lib/zoneinfo/",
	"/usr/lib/locale/TZ/",
	runtime.GOROOT() + "/lib/time/zoneinfo.zip",
}

There are platform-specific zoneinfo_XYZ.go files to find the default timezone using similar logic, e.g. for Windows or WASM. In the past, when I wanted to use timezones in a stripped-down container image, all I had to do was to add the following line to the Dockerfile when building from a Unix-like system.

COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo

On the other hand, in cases where we cannot control the build environment, there’s the time/tzdata package, which provides an embedded copy of the timezone database. If this package is imported anywhere, or we build with the -tags timetzdata flag, the program size will increase by about 450KB, but the program will also have a fallback in cases where Go cannot find a tzdata file on the host system.

Finally, we can set up the location manually from our code by using the LoadLocation function, e.g. for testing purposes.

Outro

That’s all for today! I hope y’all either learned something new, or had some fun and you’re now more confident to jump into the Go codebase!

Feel free to reach out for comments, corrections or advice via email or on Twitter.

See you soon, take care of yourself!

Bonus: What’s with `funcname1` in Go?

Throughout the Go codebase, you’ll see many references to funcname1() or funcname2(), especially as you’re getting to lower-level code.

As far as I understand, they serve two purposes: they help keep up with Go’s Compatibility Promise by making it easier to alter the internals of an unexported function, and they ‘group’ similar and/or chained functionality together.

While someone may scoff at this, I think it’s a simple and great idea to keep the code readable and maintainable.

Written on March 30, 2021