About this manual

The copyright of this document belongs to Loongson Technology Corporation Limited. Without written permission, no company or individual may disclose, reproduce or otherwise distribute any part of this document to third parties. Otherwise, they will be held legally responsible.

Disclaimer

This document provides only periodic information, and its contents may be updated at any time without notice, depending on the actual situation of the product. Loongson Technology Corporation Limited is not responsible for any direct or indirect damage caused by the improper use of the document.

Loongson Technology Corporation Limited

Building No.2, Loongson Industrial Park,
Zhongguancun Environmental Protection Park, Haidian District, Beijing

Tel: 010-62546668

Fax: 010-62600826

Reading Guide

This manual introduces the Loongson 3A5000/3B5000 multicore processor architecture and register descriptions. It provides detailed descriptions of the chip system architecture, functions and configurations of the main modules, register lists and bit fields.

Translator’s Note

These documents were translated by Yanteng Si and Feiyang Chen.

Because the translators' knowledge is limited, this document inevitably contains some errors and omissions. Corrections are welcome.

License

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Contributors

Since the release of the project, we have received several errata and content changes from contributors. Here are all the people who have contributed to LoongArch Documentation as an open source project. Thank you, everyone, for helping make this a better book.

The contributors are listed in alphabetical order.

Chao LI <lichao@loongson.cn>
Chenghua Xu <xuchenghua@loongson.cn>
Feiyang Chen <chenfeiyang@loongson.cn>
FreeFlyingSheep <fyang.168.hi@163.com>
Konstantin Romanov <konstantinsromanov@gmail.com>
LI Chao <lichao@loongson.cn>
limeidan <limeidan@loongson.cn>
liuzhensong <liuzhensong@loongson.cn>
mengqinggang <mengqinggang@loongson.cn>
Qi Hu <huqi@loongson.cn>
qmuntal <quimmuntal@gmail.com>
tangxiaolin <tangxiaolin@loongson.cn>
WANG Xuerui <git@xen0n.name>
wangguofeng <wangguofeng@loongson.cn>
Wu Xiaotian <wuxiaotian@loongson.cn>
Wu Xiaotian <yetist@gmail.com>
Xi Ruoyao <xry111@mengyan1223.wang>
Yang Yujie <yangyujie@alumni.sjtu.edu.cn>
Yang Yujie <yangyujie@loongson.cn>
Yanteng Si <siyanteng@loongson.cn>

1. Introduction

1.1. Introduction to the Loongson Family of Processors

Loongson processors mainly include three series. Loongson Series 1 processor adopts 32-bit processor cores and integrates various peripheral interfaces to form application-specific monolithic solutions, which are mainly applied to IOT terminals, instrumentation devices, data acquisition and other fields. Loongson Series 2 processor adopts 32-bit/64-bit processor cores and integrates various peripheral interfaces to form a high-performance low-power SoC chip for network devices, industrial terminals, intelligent manufacturing, etc. Loongson Series 3 processors integrate multiple 64-bit processor cores and necessary storage and IO interfaces on-chip, targeting high-end embedded computers, desktops, servers and other applications.

The Loongson 3 multi-core series processors are designed based on a scalable multi-core interconnect architecture, which integrates multiple high-performance processor cores and a large amount of Level 2 Cache on a single chip, and interconnects multiple chips through high-speed I/O interfaces to form a larger scale system.

The scalable interconnect architecture adopted by Loongson 3 is shown in the figure below. Each node consists of an 8 × 8 cross-switch, which connects four processor cores and four shared caches and interconnects with other nodes in four directions: East (E), South (S), West (W), and North (N).

Figure 1. Loongson 3 System architecture

The node structure of the Loongson 3 is shown in the figure below. Each node has two levels of AXI cross-switches connecting the processor, the shared Cache, the memory controller, and the I/O controller. The first level AXI cross-switch (called X1 Switch) connects the processor and the shared Cache, and the second level cross-switch (called X2 Switch) connects the shared Cache and the memory controller.

Figure 2. Loongson 3 node structure

In each node, the X1 cross-switch (up to 8 × 8) connects to four processor cores (P0, P1, P2, P3 in the figure) through four Master ports, and to four interleaved, uniformly addressed shared Cache blocks (S0, S1, S2, S3 in the figure) through four Slave ports. Other nodes or I/O nodes in the East, South, West and North directions are connected via four pairs of Master/Slave ports (EM/ES, SM/SS, WM/WS, NM/NS in the figure).

The X2 cross-switch connects to the four shared Caches via four Master ports, to a memory controller via at least one Slave port, and to the configuration module of the cross-switch (Xconf) via at least one Slave port. Xconf is used to configure the address windows of X1 and X2 of this node. Additional memory controllers and I/O ports can be connected as needed.

1.2. Introduction to Loongson 3A5000/3B5000

The Loongson 3A5000/3B5000 is a quad-core Loongson processor with a stable operating frequency of 2.0-2.5GHz.

The main technical features are as follows:

  • On-chip integration of four 64-bit four-issue superscalar LA464 processor cores.

  • Peak floating-point computing power 160GFLOPS@2.5GHz.

  • On-chip integration of 16MB of split shared Level 3 Cache.

  • Maintenance of Cache consistency for multi-core and I/O DMA accesses via directory protocol.

  • On-chip integration of two 72-bit DDR4 controllers with ECC, supporting DDR4-3200.

  • On-chip integration of two 16-bit HyperTransport controllers (hereinafter referred to as HT) with a maximum bus frequency of 3.2 GHz.

  • Each group of 16-bit HT ports can be split into two groups of 8-bit HT ports for use.

  • 2 I2C, 1 UART, 1 SPI, 16 GPIO interfaces on-chip

The architecture of the Loongson 3A5000/3B5000 is designed to increase the shared Cache capacity based on the 3A4000 and supports 16-way interconnect.

Compared to the 3A5000, the Loongson 3B5000 additionally supports coherent interconnect on the HT0 interface and is specially screened for server scenario requirements. There is no other difference from the hardware and software perspective, and the two are collectively referred to as 3A5000.

The overall architecture of the Loongson 3A5000 chip is based on multi-level interconnects and is shown in the figure below.

Figure 3. Loongson 3A5000 chip structure

The first level of interconnect uses a 5 × 5 frequency division switch to connect four LA464 cores (as masters), four shared Cache modules (as slaves), and one I/O port to I/O-RING (The I/O port uses one Master and one Slave).

The second level interconnect uses a 5 × 3 cross-switch to connect four shared Cache modules (as masters), two DDR3/4 memory controllers, and one I/O port to the I/O-RING.

The I/O-RING contains 8 ports, which connect 4 HT controllers, the MISC module, the SE module and the two levels of cross-switches. Each pair of HT controllers (lo/hi) shares a 16-bit HT bus, which is used either as two 8-bit HT buses or exclusively by lo as one 16-bit HT bus. A DMA controller is integrated into the HT controller and is responsible for the DMA control of the I/O and the maintenance of inter-chip consistency.

All of these interconnect structures use read/write separated data channels with a 128-bit data channel width operating at the same frequency as the processor core to provide high-speed on-chip data transport. In addition, the first-level cross-switch connects the four processor cores to the SCache with a 256-bit read data channel to increase the read bandwidth of the on-chip processor cores accessing the SCache.

2. System Configuration and Control

2.1. Chip Operating Modes

Depending on the structure of the system, the Loongson 3A5000 has two main operating modes:

  • Single-chip mode: The system contains only one Loongson 3A5000 chip and is a symmetric multiprocessor (SMP) system.

  • Multi-chip interconnect mode: The system contains 2, 4, or 16 Loongson 3A5000 chips interconnected through HT ports, forming a cache-coherent non-uniform memory access (CC-NUMA) multiprocessor system.

2.2. Descriptions of Pins

Main control pins include DO_TEST, ICCC_EN, NODE_ID[2:0], CLKSEL[9:0], and CHIP_CONFIG[5:0].

Table 5. Descriptions of control pins
Signal | Pull-up or Pull-down | Description
DO_TEST | Pull-up | 1’b1 indicates functional mode; 1’b0 indicates test mode
ICCC_EN | Pull-down | 1’b1 indicates multi-chip coherent interconnect mode; 1’b0 indicates single chip mode
NODE_ID[2:0] | - | Indicates the processor number in multi-chip coherent interconnect mode
CLKSEL[9:0] | - | See Table 1, Table 2 and Table 3
CHIP_CONFIG[5:0] | - | See Table 4

Table 1. HT clock control
Signal | Description
CLKSEL[9] | 1’b1 indicates that HT PLL clock uses CLKSEL[7:6] to control; 1’b0 indicates that initial frequency multiplier is 1X and can be reconfigured by software
CLKSEL[8] | 1’b1 indicates that HT PLL uses the SYSCLK clock input; 1’b0 indicates that HT PLL uses the differential clock input
CLKSEL[7:6] | 2’b00 indicates that PHY clock frequency is 1.6GHz; 2’b01 indicates that PHY clock frequency is 6.4GHz; 2’b10 Reserved; 2’b11 indicates that PHY clock frequency is 4.8GHz
CLKSEL[5] | Reserved
CLKSEL[4] | 1’b1 - the reference clock is 25MHz; 1’b0 - the reference clock is 100MHz

Table 2. MEM clock control (clock frequency should be 1/2 of the interface clock)
CLKSEL[3:2] | Output Frequency
2’b00 | 466MHz
2’b01 | 600MHz
2’b10 | Software configuration (PLL clock multiplier is 1.6-3.2GHz)
2’b11 | SYSCLK (100MHz/25MHz)

Table 3. Main clock control (network and maximum processor core frequency)
CLKSEL[1:0] | Output Frequency
2’b00 | 1GHz
2’b01 | 2GHz
2’b10 | Software configuration (PLL clock multiplier is 4.8-6.4GHz)
2’b11 | SYSCLK (100MHz/25MHz)

Table 4. Chip configuration control
Signal | Description
CHIP_CONFIG[0] | SE function enable
CHIP_CONFIG[1] | Default HT Gen1 Mode
CHIP_CONFIG[2] | NodeID[3]
CHIP_CONFIG[3] | HT1-hi enters consistency mode by default
CHIP_CONFIG[4] | HT1-lo enters consistency mode by default and is used to support 8/16 way interconnects
CHIP_CONFIG[5] | On-chip clock debug enable (DCDL)
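The CLKSEL encodings in Tables 1 to 3 can be sketched as a small decoder. This is a minimal illustration only: the function name `decode_clksel` and the dictionary keys are hypothetical and not part of the manual, and the frequency strings are copied from the tables as printed.

```python
# Hypothetical helper: splits a 10-bit CLKSEL[9:0] strap value into the
# fields described in Tables 1 to 3. Names are illustrative, not official.
MAIN_CLOCK = {0b00: "1GHz", 0b01: "2GHz", 0b10: "software configured", 0b11: "SYSCLK"}
MEM_CLOCK = {0b00: "466MHz", 0b01: "600MHz", 0b10: "software configured", 0b11: "SYSCLK"}
HT_PHY_CLOCK = {0b00: "1.6GHz", 0b01: "6.4GHz", 0b10: "reserved", 0b11: "4.8GHz"}


def decode_clksel(clksel):
    """Return the clock settings selected by a CLKSEL[9:0] value."""
    if not 0 <= clksel < (1 << 10):
        raise ValueError("CLKSEL is a 10-bit value")
    return {
        "main_clock": MAIN_CLOCK[clksel & 0x3],               # CLKSEL[1:0]
        "mem_clock": MEM_CLOCK[(clksel >> 2) & 0x3],          # CLKSEL[3:2]
        "ref_clock_mhz": 25 if (clksel >> 4) & 1 else 100,    # CLKSEL[4]
        "ht_phy_clock": HT_PHY_CLOCK[(clksel >> 6) & 0x3],    # CLKSEL[7:6]
        "ht_pll_uses_sysclk": bool((clksel >> 8) & 1),        # CLKSEL[8]
        "ht_pll_pin_controlled": bool((clksel >> 9) & 1),     # CLKSEL[9]
    }
```

CLKSEL[5] is reserved and is therefore ignored by the sketch.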

3. Physical Address Space Layout

The Loongson 3 Series processor has a globally accessible hierarchical addressing design for system physical address distribution to ensure extended system development compatibility. The physical address width of the entire system is 48 bits. The entire address space is evenly distributed over 16 nodes according to the high 4 bits of the address, i.e., 44 bits of address space per node.

3.1. Physical Address Space Layout Between Nodes

The Loongson 3A5000 processor can be directly connected with 2/4/8/16 3A5000 chips to build a CC-NUMA system. The processor number of each chip is determined by the pin NODEID, and the address space of each chip is distributed as follows:

Table 6. System global address layout at the node level
Chip Node ID (NODEID) | [47:44] Bits of the Address | Start address | End address
0 | 0 | 0x0000_0000_0000 | 0x0FFF_FFFF_FFFF
1 | 1 | 0x1000_0000_0000 | 0x1FFF_FFFF_FFFF
2 | 2 | 0x2000_0000_0000 | 0x2FFF_FFFF_FFFF
3 | 3 | 0x3000_0000_0000 | 0x3FFF_FFFF_FFFF
4 | 4 | 0x4000_0000_0000 | 0x4FFF_FFFF_FFFF
5 | 5 | 0x5000_0000_0000 | 0x5FFF_FFFF_FFFF
6 | 6 | 0x6000_0000_0000 | 0x6FFF_FFFF_FFFF
7 | 7 | 0x7000_0000_0000 | 0x7FFF_FFFF_FFFF
8 | 8 | 0x8000_0000_0000 | 0x8FFF_FFFF_FFFF
9 | 9 | 0x9000_0000_0000 | 0x9FFF_FFFF_FFFF
10 | 10 | 0xA000_0000_0000 | 0xAFFF_FFFF_FFFF
11 | 11 | 0xB000_0000_0000 | 0xBFFF_FFFF_FFFF
12 | 12 | 0xC000_0000_0000 | 0xCFFF_FFFF_FFFF
13 | 13 | 0xD000_0000_0000 | 0xDFFF_FFFF_FFFF
14 | 14 | 0xE000_0000_0000 | 0xEFFF_FFFF_FFFF
15 | 15 | 0xF000_0000_0000 | 0xFFFF_FFFF_FFFF

When the system has fewer than 16 nodes, the nodemask field of the route setting register (0x1fe00400) should be set so that a speculative access to the address of a non-existent physical node still receives a response (2-way: 0x1; 4-way: 0x3; 8-way: 0x7; 16-way: 0xF).
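The node-level layout above follows directly from bits [47:44] of the physical address. The following minimal sketch shows the arithmetic; the function names are hypothetical, and `nodemask_for` only reproduces the four nodemask values listed above rather than any documented formula.

```python
NODE_SHIFT = 44  # each node owns a 44-bit address space
NODE_BITS = 4    # bits [47:44] select one of 16 nodes


def node_of(addr):
    """Return the node ID held in bits [47:44] of a 48-bit physical address."""
    return (addr >> NODE_SHIFT) & ((1 << NODE_BITS) - 1)


def node_range(node_id):
    """Return the inclusive (start, end) address range owned by a node."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node ID must be in 0..15")
    start = node_id << NODE_SHIFT
    return start, start + (1 << NODE_SHIFT) - 1


def nodemask_for(ways):
    """Nodemask value for a 2-, 4-, 8- or 16-way system (values from the text)."""
    if ways not in (2, 4, 8, 16):
        raise ValueError("unsupported number of nodes")
    return ways - 1
```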

3.2. Physical Address Space Layout Within Nodes

The Loongson 3A5000 uses a single node 4-core configuration, so the corresponding addresses of the DDR memory controller and HT bus integrated in the Loongson 3A5000 chip are contained in a 44-bit address space from 0x0 (inclusive) to 0x1000_0000_0000 (exclusive). Within each node, the 44-bit address space is further divided among all devices connected within the node, and requests are routed to the four shared Cache modules only when the access type is cached. Depending on the chip and system architecture configuration, if there is no slave device connected on a port, the corresponding address space is reserved address space and access is not allowed.

The address space corresponding to each slave device side of the Loongson 3A5000 chip internal interconnect is as follows:

Table 7. Address layout within nodes
Device | [43:40] bits of the address | Start address within nodes | End address within nodes
MC0 | 4 | 0x400_0000_0000 | 0x4FF_FFFF_FFFF
MC1 | 5 | 0x500_0000_0000 | 0x5FF_FFFF_FFFF
SE | c | 0xC00_0000_0000 | 0xCFF_FFFF_FFFF
HT0 Lo controller | a | 0xA00_0000_0000 | 0xAFF_FFFF_FFFF
HT0 Hi controller | b | 0xB00_0000_0000 | 0xBFF_FFFF_FFFF
HT1 Lo controller | e | 0xE00_0000_0000 | 0xEFF_FFFF_FFFF
HT1 Hi controller | f | 0xF00_0000_0000 | 0xFFF_FFFF_FFFF
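As an illustration of the table above, the slave device for an in-node address can be looked up from bits [43:40]. This is a sketch only: the dictionary and function names are hypothetical, and values of bits [43:40] that the table does not list are simply reported as `None`.

```python
# Bits [43:40] of the in-node address, as listed in Table 7.
DEVICE_BY_BITS_43_40 = {
    0x4: "MC0",
    0x5: "MC1",
    0xA: "HT0 Lo controller",
    0xB: "HT0 Hi controller",
    0xC: "SE",
    0xE: "HT1 Lo controller",
    0xF: "HT1 Hi controller",
}


def slave_device(addr):
    """Return the device selected by bits [43:40], or None if not listed."""
    return DEVICE_BY_BITS_43_40.get((addr >> 40) & 0xF)
```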

Unlike the directional port mapping relationship, the Loongson 3A5000 can determine the cross-addressing method of the shared Cache based on the access behavior of the actual application. The address space corresponding to the four shared Cache modules in the node is determined based on one or two select bits of the address bits, and can be dynamically configured and modified by software. A configuration register named SCID_SEL is set to determine the address selection bits, as shown in the following table. By default the [7:6] address hash is used for distribution, i.e., the two bits of address [7:6] determine the corresponding shared Cache number. This register is addressed as 0x1fe00400 and can also be accessed using the configuration register instruction (IOCSR).

Table 8. SCID_SEL Address bit configuration
SCID_SEL | Address Bit Selection | SCID_SEL | Address Bit Selection
4’h0 | 7:6 | 4’h8 | 23:22
4’h1 | 9:8 | 4’h9 | 25:24
4’h2 | 11:10 | 4’ha | 27:26
4’h3 | 13:12 | 4’hb | 29:28
4’h4 | 15:14 | 4’hc | 31:30
4’h5 | 17:16 | 4’hd | 33:32
4’h6 | 19:18 | 4’he | 35:34
4’h7 | 21:20 | 4’hf | 37:36
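The SCID_SEL table follows a regular pattern: each step of SCID_SEL moves the selected bit pair up by two address bits, starting at [7:6]. A minimal sketch of that pattern is shown below. The function names are hypothetical, and the sketch assumes the two selected bits are used directly as the shared Cache number, with no additional hashing.

```python
def scid_sel_bits(scid_sel):
    """Return (high_bit, low_bit) of the address bits chosen by SCID_SEL."""
    if not 0 <= scid_sel <= 0xF:
        raise ValueError("SCID_SEL is a 4-bit value")
    low = 6 + 2 * scid_sel
    return low + 1, low


def scache_index(addr, scid_sel=0):
    """Return the shared Cache number (0..3) for an address.

    Assumption: the selected bit pair is used as the index unchanged.
    """
    _, low = scid_sel_bits(scid_sel)
    return (addr >> low) & 0x3
```

With the default SCID_SEL of 0, consecutive 64-byte blocks rotate across the four shared Cache modules.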

The default distribution of the internal 44-bit physical addresses for each node of the Loongson 3A5000 processor is shown in the table below:

Table 9. 44-bit physical address layout within nodes
Address Range | Access Properties | Destination
addr[43:40]==4’ha | Local node, uncache | HT0_LO
addr[43:40]==4’hb | Local node, uncache | HT0_HI
addr[43:40]==4’hc | Local node, uncache | SE
addr[43:40]==4’he | Local node, uncache | HT1_LO
addr[43:40]==4’hf | Local node, uncache | HT1_HI
0x10000000-0x1fffffff, 0x3ff00000-0x3ff0ffff (can be turned off) | Local node, uncache | MISC
Mc interleave is 0 and is not the above address | Local node, uncache | MC0
Mc interleave is 1 and is not the above address | Local node, uncache | MC1
SCache interleave is 0 (address bit selection is determined by scid_sel) | Local node, Cache | Scache0
SCache interleave is 1 (address bit selection is determined by scid_sel) | Local node, Cache | Scache1
SCache interleave is 2 (address bit selection is determined by scid_sel) | Local node, Cache | Scache2
SCache interleave is 3 (address bit selection is determined by scid_sel) | Local node, Cache | Scache3

3.3. Address Routing Layout and Configuration

The routing of the Loongson 3A5000 is mainly implemented through the system’s two-level cross-switch together with the IO-RING. Software can configure the routing of the requests received by each Master port. Each Master port has 8 address windows, so requests can be routed to targets through up to 8 address windows. Each address window consists of three 64-bit registers, BASE, MASK and MMAP. BASE is aligned to K bytes; MASK uses a format similar to a network mask, with the high bits set to 1; the lower four bits of MMAP indicate the number of the corresponding target Slave port; MMAP[4] allows instruction fetch; MMAP[5] allows block read; MMAP[6] allows interleaved access; MMAP[7] enables the window.

Table 10. Space access attributes corresponding to MMAP
Bit | Description
[7] | Window enable
[6] | Allow interleaved access to SCache/memory
[5] | Allow to read blocks
[4] | Allow to fetch instructions

Window hit formula: (IN_ADDR & MASK) == BASE

Since Loongson 3 uses fixed routing by default, the configuration window is closed at power-up and requires system software to enable it for use.

When the SCache/memory interleaved access configuration is enabled, the slave number is only valid when it is 0 or 4. 0 indicates routing to SCACHE and SCID_SEL determines how interleaved access is performed across the 4 SCaches. 4 indicates routing to memory and interleave_bit determines how interleaved accesses are performed across the 2 MCs.
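The low byte of MMAP described above can be decoded with a few bit masks. The sketch below is illustrative only: the constant and function names are hypothetical, and `interleave_target` merely restates the rule that, with interleaving enabled, only slave numbers 0 (SCache) and 4 (memory) are valid.

```python
WINDOW_ENABLE = 1 << 7     # MMAP[7]
ALLOW_INTERLEAVE = 1 << 6  # MMAP[6]
ALLOW_BLOCK_READ = 1 << 5  # MMAP[5]
ALLOW_FETCH = 1 << 4       # MMAP[4]
SLAVE_MASK = 0xF           # MMAP[3:0]


def decode_mmap_attrs(mmap):
    """Split the low byte of an MMAP register into its attribute fields."""
    return {
        "slave": mmap & SLAVE_MASK,
        "allow_fetch": bool(mmap & ALLOW_FETCH),
        "allow_block_read": bool(mmap & ALLOW_BLOCK_READ),
        "allow_interleave": bool(mmap & ALLOW_INTERLEAVE),
        "enabled": bool(mmap & WINDOW_ENABLE),
    }


def interleave_target(mmap):
    """Return 'SCache' or 'memory' for an interleaved window, else None."""
    attrs = decode_mmap_attrs(mmap)
    if not attrs["allow_interleave"]:
        return None
    if attrs["slave"] == 0:
        return "SCache"
    if attrs["slave"] == 4:
        return "memory"
    raise ValueError("with interleaving enabled, the slave number must be 0 or 4")
```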

The address window registers are listed in the table below. The addresses in the table are offsets from the base address 0x1FE0_0000; the registers can also be accessed via the IOCSR instruction.

Table 11. Table of address window registers
Address | Register | Address | Register
2000 | CORE0_WIN0_BASE | 2100 | CORE1_WIN0_BASE
2008 | CORE0_WIN1_BASE | 2108 | CORE1_WIN1_BASE
2010 | CORE0_WIN2_BASE | 2110 | CORE1_WIN2_BASE
2018 | CORE0_WIN3_BASE | 2118 | CORE1_WIN3_BASE
2020 | CORE0_WIN4_BASE | 2120 | CORE1_WIN4_BASE
2028 | CORE0_WIN5_BASE | 2128 | CORE1_WIN5_BASE
2030 | CORE0_WIN6_BASE | 2130 | CORE1_WIN6_BASE
2038 | CORE0_WIN7_BASE | 2138 | CORE1_WIN7_BASE
2040 | CORE0_WIN0_MASK | 2140 | CORE1_WIN0_MASK
2048 | CORE0_WIN1_MASK | 2148 | CORE1_WIN1_MASK
2050 | CORE0_WIN2_MASK | 2150 | CORE1_WIN2_MASK
2058 | CORE0_WIN3_MASK | 2158 | CORE1_WIN3_MASK
2060 | CORE0_WIN4_MASK | 2160 | CORE1_WIN4_MASK
2068 | CORE0_WIN5_MASK | 2168 | CORE1_WIN5_MASK
2070 | CORE0_WIN6_MASK | 2170 | CORE1_WIN6_MASK
2078 | CORE0_WIN7_MASK | 2178 | CORE1_WIN7_MASK
2080 | CORE0_WIN0_MMAP | 2180 | CORE1_WIN0_MMAP
2088 | CORE0_WIN1_MMAP | 2188 | CORE1_WIN1_MMAP
2090 | CORE0_WIN2_MMAP | 2190 | CORE1_WIN2_MMAP
2098 | CORE0_WIN3_MMAP | 2198 | CORE1_WIN3_MMAP
20a0 | CORE0_WIN4_MMAP | 21a0 | CORE1_WIN4_MMAP
20a8 | CORE0_WIN5_MMAP | 21a8 | CORE1_WIN5_MMAP
20b0 | CORE0_WIN6_MMAP | 21b0 | CORE1_WIN6_MMAP
20b8 | CORE0_WIN7_MMAP | 21b8 | CORE1_WIN7_MMAP
2200 | CORE2_WIN0_BASE | 2300 | CORE3_WIN0_BASE
2208 | CORE2_WIN1_BASE | 2308 | CORE3_WIN1_BASE
2210 | CORE2_WIN2_BASE | 2310 | CORE3_WIN2_BASE
2218 | CORE2_WIN3_BASE | 2318 | CORE3_WIN3_BASE
2220 | CORE2_WIN4_BASE | 2320 | CORE3_WIN4_BASE
2228 | CORE2_WIN5_BASE | 2328 | CORE3_WIN5_BASE
2230 | CORE2_WIN6_BASE | 2330 | CORE3_WIN6_BASE
2238 | CORE2_WIN7_BASE | 2338 | CORE3_WIN7_BASE
2240 | CORE2_WIN0_MASK | 2340 | CORE3_WIN0_MASK
2248 | CORE2_WIN1_MASK | 2348 | CORE3_WIN1_MASK
2250 | CORE2_WIN2_MASK | 2350 | CORE3_WIN2_MASK
2258 | CORE2_WIN3_MASK | 2358 | CORE3_WIN3_MASK
2260 | CORE2_WIN4_MASK | 2360 | CORE3_WIN4_MASK
2268 | CORE2_WIN5_MASK | 2368 | CORE3_WIN5_MASK
2270 | CORE2_WIN6_MASK | 2370 | CORE3_WIN6_MASK
2278 | CORE2_WIN7_MASK | 2378 | CORE3_WIN7_MASK
2280 | CORE2_WIN0_MMAP | 2380 | CORE3_WIN0_MMAP
2288 | CORE2_WIN1_MMAP | 2388 | CORE3_WIN1_MMAP
2290 | CORE2_WIN2_MMAP | 2390 | CORE3_WIN2_MMAP
2298 | CORE2_WIN3_MMAP | 2398 | CORE3_WIN3_MMAP
22a0 | CORE2_WIN4_MMAP | 23a0 | CORE3_WIN4_MMAP
22a8 | CORE2_WIN5_MMAP | 23a8 | CORE3_WIN5_MMAP
22b0 | CORE2_WIN6_MMAP | 23b0 | CORE3_WIN6_MMAP
22b8 | CORE2_WIN7_MMAP | 23b8 | CORE3_WIN7_MMAP
2400 | SCACHE0_WIN0_BASE | 2500 | SCACHE1_WIN0_BASE
2408 | SCACHE0_WIN1_BASE | 2508 | SCACHE1_WIN1_BASE
2410 | SCACHE0_WIN2_BASE | 2510 | SCACHE1_WIN2_BASE
2418 | SCACHE0_WIN3_BASE | 2518 | SCACHE1_WIN3_BASE
2420 | SCACHE0_WIN4_BASE | 2520 | SCACHE1_WIN4_BASE
2428 | SCACHE0_WIN5_BASE | 2528 | SCACHE1_WIN5_BASE
2430 | SCACHE0_WIN6_BASE | 2530 | SCACHE1_WIN6_BASE
2438 | SCACHE0_WIN7_BASE | 2538 | SCACHE1_WIN7_BASE
2440 | SCACHE0_WIN0_MASK | 2540 | SCACHE1_WIN0_MASK
2448 | SCACHE0_WIN1_MASK | 2548 | SCACHE1_WIN1_MASK
2450 | SCACHE0_WIN2_MASK | 2550 | SCACHE1_WIN2_MASK
2458 | SCACHE0_WIN3_MASK | 2558 | SCACHE1_WIN3_MASK
2460 | SCACHE0_WIN4_MASK | 2560 | SCACHE1_WIN4_MASK
2468 | SCACHE0_WIN5_MASK | 2568 | SCACHE1_WIN5_MASK
2470 | SCACHE0_WIN6_MASK | 2570 | SCACHE1_WIN6_MASK
2478 | SCACHE0_WIN7_MASK | 2578 | SCACHE1_WIN7_MASK
2480 | SCACHE0_WIN0_MMAP | 2580 | SCACHE1_WIN0_MMAP
2488 | SCACHE0_WIN1_MMAP | 2588 | SCACHE1_WIN1_MMAP
2490 | SCACHE0_WIN2_MMAP | 2590 | SCACHE1_WIN2_MMAP
2498 | SCACHE0_WIN3_MMAP | 2598 | SCACHE1_WIN3_MMAP
24a0 | SCACHE0_WIN4_MMAP | 25a0 | SCACHE1_WIN4_MMAP
24a8 | SCACHE0_WIN5_MMAP | 25a8 | SCACHE1_WIN5_MMAP
24b0 | SCACHE0_WIN6_MMAP | 25b0 | SCACHE1_WIN6_MMAP
24b8 | SCACHE0_WIN7_MMAP | 25b8 | SCACHE1_WIN7_MMAP
2600 | SCACHE2_WIN0_BASE | 2700 | SCACHE3_WIN0_BASE
2608 | SCACHE2_WIN1_BASE | 2708 | SCACHE3_WIN1_BASE
2610 | SCACHE2_WIN2_BASE | 2710 | SCACHE3_WIN2_BASE
2618 | SCACHE2_WIN3_BASE | 2718 | SCACHE3_WIN3_BASE
2620 | SCACHE2_WIN4_BASE | 2720 | SCACHE3_WIN4_BASE
2628 | SCACHE2_WIN5_BASE | 2728 | SCACHE3_WIN5_BASE
2630 | SCACHE2_WIN6_BASE | 2730 | SCACHE3_WIN6_BASE
2638 | SCACHE2_WIN7_BASE | 2738 | SCACHE3_WIN7_BASE
2640 | SCACHE2_WIN0_MASK | 2740 | SCACHE3_WIN0_MASK
2648 | SCACHE2_WIN1_MASK | 2748 | SCACHE3_WIN1_MASK
2650 | SCACHE2_WIN2_MASK | 2750 | SCACHE3_WIN2_MASK
2658 | SCACHE2_WIN3_MASK | 2758 | SCACHE3_WIN3_MASK
2660 | SCACHE2_WIN4_MASK | 2760 | SCACHE3_WIN4_MASK
2668 | SCACHE2_WIN5_MASK | 2768 | SCACHE3_WIN5_MASK
2670 | SCACHE2_WIN6_MASK | 2770 | SCACHE3_WIN6_MASK
2678 | SCACHE2_WIN7_MASK | 2778 | SCACHE3_WIN7_MASK
2680 | SCACHE2_WIN0_MMAP | 2780 | SCACHE3_WIN0_MMAP
2688 | SCACHE2_WIN1_MMAP | 2788 | SCACHE3_WIN1_MMAP
2690 | SCACHE2_WIN2_MMAP | 2790 | SCACHE3_WIN2_MMAP
2698 | SCACHE2_WIN3_MMAP | 2798 | SCACHE3_WIN3_MMAP
26a0 | SCACHE2_WIN4_MMAP | 27a0 | SCACHE3_WIN4_MMAP
26a8 | SCACHE2_WIN5_MMAP | 27a8 | SCACHE3_WIN5_MMAP
26b0 | SCACHE2_WIN6_MMAP | 27b0 | SCACHE3_WIN6_MMAP
26b8 | SCACHE2_WIN7_MMAP | 27b8 | SCACHE3_WIN7_MMAP
- | - | 2900 | IO_L2X_WIN0_BASE
- | - | 2908 | IO_L2X_WIN1_BASE
- | - | 2910 | IO_L2X_WIN2_BASE
- | - | 2918 | IO_L2X_WIN3_BASE
- | - | 2920 | IO_L2X_WIN4_BASE
- | - | 2928 | IO_L2X_WIN5_BASE
- | - | 2930 | IO_L2X_WIN6_BASE
- | - | 2938 | IO_L2X_WIN7_BASE
- | - | 2940 | IO_L2X_WIN0_MASK
- | - | 2948 | IO_L2X_WIN1_MASK
- | - | 2950 | IO_L2X_WIN2_MASK
- | - | 2958 | IO_L2X_WIN3_MASK
- | - | 2960 | IO_L2X_WIN4_MASK
- | - | 2968 | IO_L2X_WIN5_MASK
- | - | 2970 | IO_L2X_WIN6_MASK
- | - | 2978 | IO_L2X_WIN7_MASK
- | - | 2980 | IO_L2X_WIN0_MMAP
- | - | 2988 | IO_L2X_WIN1_MMAP
- | - | 2990 | IO_L2X_WIN2_MMAP
- | - | 2998 | IO_L2X_WIN3_MMAP
- | - | 29a0 | IO_L2X_WIN4_MMAP
- | - | 29a8 | IO_L2X_WIN5_MMAP
- | - | 29b0 | IO_L2X_WIN6_MMAP
- | - | 29b8 | IO_L2X_WIN7_MMAP
2a00 | HT0_LO_WIN0_BASE | 2b00 | HT0_HI_WIN0_BASE
2a08 | HT0_LO_WIN1_BASE | 2b08 | HT0_HI_WIN1_BASE
2a10 | HT0_LO_WIN2_BASE | 2b10 | HT0_HI_WIN2_BASE
2a18 | HT0_LO_WIN3_BASE | 2b18 | HT0_HI_WIN3_BASE
2a20 | HT0_LO_WIN4_BASE | 2b20 | HT0_HI_WIN4_BASE
2a28 | HT0_LO_WIN5_BASE | 2b28 | HT0_HI_WIN5_BASE
2a30 | HT0_LO_WIN6_BASE | 2b30 | HT0_HI_WIN6_BASE
2a38 | HT0_LO_WIN7_BASE | 2b38 | HT0_HI_WIN7_BASE
2a40 | HT0_LO_WIN0_MASK | 2b40 | HT0_HI_WIN0_MASK
2a48 | HT0_LO_WIN1_MASK | 2b48 | HT0_HI_WIN1_MASK
2a50 | HT0_LO_WIN2_MASK | 2b50 | HT0_HI_WIN2_MASK
2a58 | HT0_LO_WIN3_MASK | 2b58 | HT0_HI_WIN3_MASK
2a60 | HT0_LO_WIN4_MASK | 2b60 | HT0_HI_WIN4_MASK
2a68 | HT0_LO_WIN5_MASK | 2b68 | HT0_HI_WIN5_MASK
2a70 | HT0_LO_WIN6_MASK | 2b70 | HT0_HI_WIN6_MASK
2a78 | HT0_LO_WIN7_MASK | 2b78 | HT0_HI_WIN7_MASK
2a80 | HT0_LO_WIN0_MMAP | 2b80 | HT0_HI_WIN0_MMAP
2a88 | HT0_LO_WIN1_MMAP | 2b88 | HT0_HI_WIN1_MMAP
2a90 | HT0_LO_WIN2_MMAP | 2b90 | HT0_HI_WIN2_MMAP
2a98 | HT0_LO_WIN3_MMAP | 2b98 | HT0_HI_WIN3_MMAP
2aa0 | HT0_LO_WIN4_MMAP | 2ba0 | HT0_HI_WIN4_MMAP
2aa8 | HT0_LO_WIN5_MMAP | 2ba8 | HT0_HI_WIN5_MMAP
2ab0 | HT0_LO_WIN6_MMAP | 2bb0 | HT0_HI_WIN6_MMAP
2ab8 | HT0_LO_WIN7_MMAP | 2bb8 | HT0_HI_WIN7_MMAP
2c00 | SE_WIN0_BASE | 2d00 | MISC_WIN0_BASE
2c08 | SE_WIN1_BASE | 2d08 | MISC_WIN1_BASE
2c10 | SE_WIN2_BASE | 2d10 | MISC_WIN2_BASE
2c18 | SE_WIN3_BASE | 2d18 | MISC_WIN3_BASE
2c20 | SE_WIN4_BASE | 2d20 | MISC_WIN4_BASE
2c28 | SE_WIN5_BASE | 2d28 | MISC_WIN5_BASE
2c30 | SE_WIN6_BASE | 2d30 | MISC_WIN6_BASE
2c38 | SE_WIN7_BASE | 2d38 | MISC_WIN7_BASE
2c40 | SE_WIN0_MASK | 2d40 | MISC_WIN0_MASK
2c48 | SE_WIN1_MASK | 2d48 | MISC_WIN1_MASK
2c50 | SE_WIN2_MASK | 2d50 | MISC_WIN2_MASK
2c58 | SE_WIN3_MASK | 2d58 | MISC_WIN3_MASK
2c60 | SE_WIN4_MASK | 2d60 | MISC_WIN4_MASK
2c68 | SE_WIN5_MASK | 2d68 | MISC_WIN5_MASK
2c70 | SE_WIN6_MASK | 2d70 | MISC_WIN6_MASK
2c78 | SE_WIN7_MASK | 2d78 | MISC_WIN7_MASK
2c80 | SE_WIN0_MMAP | 2d80 | MISC_WIN0_MMAP
2c88 | SE_WIN1_MMAP | 2d88 | MISC_WIN1_MMAP
2c90 | SE_WIN2_MMAP | 2d90 | MISC_WIN2_MMAP
2c98 | SE_WIN3_MMAP | 2d98 | MISC_WIN3_MMAP
2ca0 | SE_WIN4_MMAP | 2da0 | MISC_WIN4_MMAP
2ca8 | SE_WIN5_MMAP | 2da8 | MISC_WIN5_MMAP
2cb0 | SE_WIN6_MMAP | 2db0 | MISC_WIN6_MMAP
2cb8 | SE_WIN7_MMAP | 2db8 | MISC_WIN7_MMAP
2e00 | HT1_LO_WIN0_BASE | 2f00 | HT1_HI_WIN0_BASE
2e08 | HT1_LO_WIN1_BASE | 2f08 | HT1_HI_WIN1_BASE
2e10 | HT1_LO_WIN2_BASE | 2f10 | HT1_HI_WIN2_BASE
2e18 | HT1_LO_WIN3_BASE | 2f18 | HT1_HI_WIN3_BASE
2e20 | HT1_LO_WIN4_BASE | 2f20 | HT1_HI_WIN4_BASE
2e28 | HT1_LO_WIN5_BASE | 2f28 | HT1_HI_WIN5_BASE
2e30 | HT1_LO_WIN6_BASE | 2f30 | HT1_HI_WIN6_BASE
2e38 | HT1_LO_WIN7_BASE | 2f38 | HT1_HI_WIN7_BASE
2e40 | HT1_LO_WIN0_MASK | 2f40 | HT1_HI_WIN0_MASK
2e48 | HT1_LO_WIN1_MASK | 2f48 | HT1_HI_WIN1_MASK
2e50 | HT1_LO_WIN2_MASK | 2f50 | HT1_HI_WIN2_MASK
2e58 | HT1_LO_WIN3_MASK | 2f58 | HT1_HI_WIN3_MASK
2e60 | HT1_LO_WIN4_MASK | 2f60 | HT1_HI_WIN4_MASK
2e68 | HT1_LO_WIN5_MASK | 2f68 | HT1_HI_WIN5_MASK
2e70 | HT1_LO_WIN6_MASK | 2f70 | HT1_HI_WIN6_MASK
2e78 | HT1_LO_WIN7_MASK | 2f78 | HT1_HI_WIN7_MASK
2e80 | HT1_LO_WIN0_MMAP | 2f80 | HT1_HI_WIN0_MMAP
2e88 | HT1_LO_WIN1_MMAP | 2f88 | HT1_HI_WIN1_MMAP
2e90 | HT1_LO_WIN2_MMAP | 2f90 | HT1_HI_WIN2_MMAP
2e98 | HT1_LO_WIN3_MMAP | 2f98 | HT1_HI_WIN3_MMAP
2ea0 | HT1_LO_WIN4_MMAP | 2fa0 | HT1_HI_WIN4_MMAP
2ea8 | HT1_LO_WIN5_MMAP | 2fa8 | HT1_HI_WIN5_MMAP
2eb0 | HT1_LO_WIN6_MMAP | 2fb0 | HT1_HI_WIN6_MMAP
2eb8 | HT1_LO_WIN7_MMAP | 2fb8 | HT1_HI_WIN7_MMAP
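The register addresses in the table above are regular: each master owns a 0x100-byte block starting at 0x2000, the BASE, MASK and MMAP groups start at 0x00, 0x40 and 0x80 within that block, and each window takes 8 bytes. The sketch below computes an offset from those three inputs. The names `MASTER_INDEX`, `KIND_OFFSET` and `window_reg_offset` are hypothetical, and the layout is inferred from the table rather than stated by the manual.

```python
CONFIG_BASE = 0x1FE0_0000  # base address given in the text

# Block number of each master, read off the table (0x2000 + index * 0x100).
MASTER_INDEX = {
    "CORE0": 0x0, "CORE1": 0x1, "CORE2": 0x2, "CORE3": 0x3,
    "SCACHE0": 0x4, "SCACHE1": 0x5, "SCACHE2": 0x6, "SCACHE3": 0x7,
    "IO_L2X": 0x9,
    "HT0_LO": 0xA, "HT0_HI": 0xB, "SE": 0xC, "MISC": 0xD,
    "HT1_LO": 0xE, "HT1_HI": 0xF,
}
KIND_OFFSET = {"BASE": 0x00, "MASK": 0x40, "MMAP": 0x80}


def window_reg_offset(master, window, kind):
    """Return the offset of <master>_WIN<window>_<kind> from CONFIG_BASE."""
    if not 0 <= window <= 7:
        raise ValueError("each master has windows 0..7")
    return 0x2000 + (MASTER_INDEX[master] << 8) + KIND_OFFSET[kind] + 8 * window


def window_reg_address(master, window, kind):
    """Return the full physical address of the register."""
    return CONFIG_BASE + window_reg_offset(master, window, kind)
```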

The second-level cross-switch connects the 2 memory controllers and the IO-RING as slave devices, and the 4 SCaches and the IO-RING as master devices. The window configuration registers of these masters (blocks 4, 5, 6 and 7 for the SCaches, i.e. offsets 0x24xx to 0x27xx, and block 9 for the IO-RING, i.e. offsets 0x29xx) can be used for memory window configuration and address translation.

Each address window consists of three 64-bit registers, BASE, MASK, and MMAP. BASE is aligned to K bytes, MASK uses a format similar to a network mask with the high bits set to 1, and MMAP contains the translated address, routing, and enable control bits, as shown in the following table:

Table 12. Description of MMAP register bit field
Bits | Description
[63:48] | Reserved
[47:10] | Address after translation
[7:4] | Window enable
[3:0] | Slave device number

Among them, the devices corresponding to the slave device number are shown in the following table:

Table 13. Correspondence from the device number to the module it belongs to
Slave Device Number | Destination Device
0-3 | Scache0-3
4-5 | MC0-1
a | HT0_lo
b | HT0_hi
c | SE
d | MISC
e | HT1_lo
f | HT1_hi

The meaning of the window enable bits is shown in the table below:

Table 14. Space access attributes corresponding to MMAP
Bit | Description
[7] | Window enable
[6] | Allow interleaved access to DDR. Valid when the slave device number is 0, to route requests for hit window addresses as configured by the interleave_bit (CSR0x0400). The interleave enable bit is required to be greater than 10
[5] | Allow to read blocks
[4] | Allow to fetch instructions

Note that the window configuration cannot perform address translation for Cache consistency requests, otherwise the address at the SCache will not match the address at the processor-level Cache, resulting in a Cache consistency maintenance error.

Window hit formula: (IN_ADDR & MASK) == BASE

New address conversion formula: OUT_ADDR = (IN_ADDR & ~MASK) | {MMAP[63:10], 10’h0}
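The two formulas above can be checked with a short sketch. The function names and the example window values are hypothetical; the arithmetic follows the formulas as written, treating all registers as 64-bit values.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def window_hit(in_addr, base, mask):
    """Window hit formula: (IN_ADDR & MASK) == BASE."""
    return (in_addr & mask) == base


def translate(in_addr, mask, mmap):
    """OUT_ADDR = (IN_ADDR & ~MASK) | {MMAP[63:10], 10'h0}."""
    new_base = mmap & ~0x3FF & MASK64  # MMAP with its low 10 bits cleared
    return (in_addr & ~mask & MASK64) | new_base


# Hypothetical 256 MB window at 0x1000_0000, remapped to 0x4000_0000.
# The low byte 0x84 of MMAP means: window enabled, slave device 4.
BASE = 0x0000_0000_1000_0000
MASK = 0xFFFF_FFFF_F000_0000
MMAP = 0x0000_0000_4000_0084

if window_hit(0x1234_5678, BASE, MASK):
    print(hex(translate(0x1234_5678, MASK, MMAP)))
```

The offset inside the window (the bits cleared in MASK) passes through unchanged, and only the upper bits are replaced by the address held in MMAP.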

According to the default register configuration, the CPU address ranges are mapped as follows after the chip is booted:

  • 0x00000000-0x0fffffff (256M) is mapped to the address interval 0x00000000-0x0fffffff of the DDR.

  • 0x10000000-0x17ffffff is mapped to the PCI_MEM space of the bridge chip.

  • 0x18000000-0x19ffffff is mapped to the PCI_IO space of the bridge chip.

  • 0x1a000000-0x1affffff is mapped to the bridge chip’s PCI configuration space (Type0).

  • 0x1b000000-0x1bffffff is mapped to the bridge chip’s PCI configuration space (Type1).

  • 0x40000000-0x7fffffff is mapped to the bridge chip’s PCI_MEM space.

Software can implement new address space routing and translation by modifying the corresponding configuration registers.

In addition, when speculative execution by the CPU causes a read access to an illegal address, none of the 8 address windows is hit, and random data is returned to prevent the CPU from hanging.

4. Chip Configuration Register

The chip configuration registers in the Loongson 3A5000 provide a mechanism for reading and writing the configuration of various functions of the chip. The individual configuration registers are described in detail below.

The base address of each chip configuration register in this chapter is 0x1fe00000, which can also be accessed using the configuration register instruction (IOCSR).

CSR[A][B] in this document indicates bit B in the IOCSR register with offset address A, where B can be a range.
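The CSR[A][B] notation amounts to a shift and mask on the value read from offset A. A minimal sketch is given below; the helper name `csr_field` is hypothetical, and how the register value is obtained (memory-mapped access at 0x1fe00000 + A or the IOCSR instruction) is outside its scope.

```python
def csr_field(value, hi, lo=None):
    """Extract bits [hi:lo] of a register value; a single bit if lo is omitted."""
    if lo is None:
        lo = hi
    if hi < lo or lo < 0:
        raise ValueError("expected hi >= lo >= 0")
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)
```

For example, if `value` holds the contents of the register at offset 0x0180, then `csr_field(value, 26, 24)` corresponds to CSR[0x0180][26:24].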

4.1. Version Register (0x0000)

The offset address is 0x0000.

Table 15. Version register
Bit Field | Name | Read/Write | Reset Value | Description
7:0 | Version | R | 8’h11 | Configuration register version number

4.2. Chip Characteristics Register (0x0008)

This register identifies a number of software-related processor features for software to view before enabling a specific function. The offset address is 0x0008.

Table 16. Chip characteristics register
Bit Field | Name | Read/Write | Reset Value | Description
0 | Centigrade | R | 1’b1 | 1 indicates that CSR[0x428] is valid
1 | Node counter | R | 1’b1 | 1 indicates that CSR[0x408] is valid
2 | MSI | R | 1’b1 | 1 indicates that MSI is available
3 | EXT_IOI | R | 1’b1 | 1 indicates that EXT_IOI is available
4 | IPI_percore | R | 1’b1 | 1 indicates that IPI is sent via VSR private address
5 | Freq_percore | R | 1’b1 | 1 indicates that frequency is modulated via VSR private address
6 | Freq_scale | R | 1’b1 | 1 indicates that dynamic frequency division is available
7 | DVFS_v1 | R | 1’b1 | 1 indicates that dynamic frequency adjustment v1 is available
8 | Tsensor | R | 1’b1 | 1 indicates that temperature sensor is available
9 | Interrupt decoding | R | 1’b1 | Interrupt pin decoding mode is available
10 | Flat Mode | R | 1’b0 | Legacy compatibility mode
11 | Guest Mode | WR | 1’b0 | KVM Virtual Machine Mode
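Software that checks features before enabling a function can decode this register bit by bit. The following sketch lists the names of the bits that are set; the dictionary and the function name `decode_features` are hypothetical, and the value passed in is assumed to have been read from offset 0x0008 beforehand.

```python
# Bit positions and names as listed in the chip characteristics register table.
FEATURE_BITS = {
    0: "Centigrade",
    1: "Node counter",
    2: "MSI",
    3: "EXT_IOI",
    4: "IPI_percore",
    5: "Freq_percore",
    6: "Freq_scale",
    7: "DVFS_v1",
    8: "Tsensor",
    9: "Interrupt decoding",
    10: "Flat Mode",
    11: "Guest Mode",
}


def decode_features(value):
    """Return the names of all feature bits set in the register value."""
    return [name for bit, name in sorted(FEATURE_BITS.items()) if (value >> bit) & 1]
```

With the reset values from the table, bits 0 to 9 are set, which corresponds to a register value of 0x3FF.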

4.3. Manufacturer Name Register (0x0010)

This register is used to identify the vendor name. The offset address is 0x0010.

Table 17. Manufacturer name register
Bit Field | Name | Read/Write | Reset Value | Description
63:0 | Vendor | R | 0x6e6f7367_6e6f6f4c | String “Loongson”

4.4. Chip Name Register (0x0020)

This register is used to identify the chip name. The offset address is 0x0020.

Table 18. Chip name register
Bit Field | Name | Read/Write | Reset Value | Description
63:0 | ID | R | 0x00003030_30354133 | String “3A5000”
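The reset values of the manufacturer name and chip name registers are ASCII strings packed into 64 bits, with the first character in the least significant byte. That byte order is inferred from the reset values themselves, not stated in the text. A minimal decoding sketch, with the hypothetical helper name `decode_id_string`:

```python
def decode_id_string(value):
    """Decode a 64-bit register value as a little-endian ASCII string.

    Trailing zero bytes are treated as padding and removed.
    """
    raw = value.to_bytes(8, "little")
    return raw.rstrip(b"\x00").decode("ascii")


print(decode_id_string(0x6E6F7367_6E6F6F4C))  # manufacturer name reset value
print(decode_id_string(0x00003030_30354133))  # chip name reset value
```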

4.5. Function configuration Register (0x0180)

The offset address is 0x0180.

Table 19. Function configuration register
Bit Field Name Read/Write Reset Value Description

0

RW

1’b0

1

RW

1’b0

3:2

RW

2’b0

Reserved

4

MC0_disable_confspace

RW

1’b0

Whether to disable MC0 DDR configuration space

5

MC0_defult_confspace

RW

1’b1

Routing all memory accesses to the configuration space

6

MCA0 clock en

RW

1’b1

MCA0 clock enable

7

MC0_resetn

RW

1’b1

MC0 software reset (active low)

8

MC0_clken

RW

1’b1

Whether to enable MC0

9

MC1_disable_confspace

RW

1’b0

Whether to disable MC1 DDR configuration space

10

MC1_defult_confspace

RW

1’b1

Routing all memory accesses to the configuration space

11

MCA1 clock en

RW

1’b1

MCA1 clock enable

12

MC1_resetn

RW

1’b1

MC1 software reset (active low)

13

MC1_clken

RW

1’b1

Whether to enable MC1

26:24

HT0_freq_scale_ctrl

RW

3’b011

HT Controller 0 frequency division

27

HT0_clken

RW

1’b1

Whether to enable HT0

30:28

HT1_freq_scale_ctrl

RW

3’b011

HT Controller 1 frequency division

31

HT1_clken

RW

1’b1

Whether to enable HT1

42:40

Node_freq_ctrl

RW

3’b111

Node Frequency Division

43

-

RW

1’b1

63:56

Cpu_version

R

8’h3D

CPU version

4.6. Pin Controller Driver Configuration Register (0x0188)

The offset address is 0x0188.

Table 20. Pin controller driver configuration register
Bit Field Name Read/Write Reset Value Description

15:0

19:16

HT sideband

RW

2’b0

HT control signal driver configuration

23:20

I2C

RW

2’b0

I2C control signal driver configuration

27:24

UART

RW

2’b0

UART control signal driver configuration

31:28

SPI

RW

2’b0

SPI control signal driver configuration

35:32

GPIO

RW

2’b0

GPIO control signal driver configuration

39:36

SE UART

RW

2’b0

SE UART control signal driver configuration

43:40

SE SPI

RW

2’b0

SE SPI control signal driver configuration

47:44

SE I2C

RW

2’b0

SE I2C control signal driver configuration

51:48

SE SCI

RW

2’b0

SE SCI control signal driver configuration

55:52

SE RNG

RW

2’b0

SE RNG control signal driver configuration

59:56

SE GPIO

RW

2’b0

SE GPIO control signal driver configuration

4.7. Function Collection Register (0x0190)

The offset address is 0x0190.

Table 21. Function collection register
Bit Field Name Read/Write Reset Value Description

31:0

R

Reserved

37:32

Chip_config

R

Motherboard configuration control

47:38

Sys_clkseli

R

On-board frequency multiplier configuration

55:48

Bad_ip_core

R

core7-core0 are bad or not

57:56

Bad_ip_ddr

R

2 DDR controllers are bad or not

61:60

Bad_ip_ht

R

2 HT Controllers are bad or not

4.8. Temperature Collection Register (0x0198)

The offset address is 0x0198.

Table 22. Temperature collection register
Bit Field Name Read/Write Reset Value Description

15:0

R

Reserved

19:16

R

Reserved

20

dotest

R

Dotest pin status

21

iccc_en

R

Iccc_en pin status

23:22

R

Reserved

24

Thsens0_overflow

R

Temperature sensor 0 overflow

25

Thsens1_overflow

R

Temperature sensor 1 overflow

31:26

47:32

Thsens0_out

R

Temperature sensor 0 centigrade temperature

Node temperature = Thsens0_out * 731 / 0x4000 - 273

Temperature range: -40 to 125 degrees Celsius

63:48

Thsens1_out

R

Temperature sensor 1 centigrade temperature

Node temperature = Thsens1_out * 731 / 0x4000 - 273

Temperature range: -40 to 125 degrees Celsius
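The conversion formula in Table 22 can be applied directly in integer arithmetic. The following is a small sketch under that assumption; the function names are illustrative, and the input is a value already read from the register at offset 0x0198.

```c
#include <stdint.h>

/* Extracts Thsens0_out (bits 47:32) or Thsens1_out (bits 63:48) from a
 * value read from the temperature collection register (offset 0x0198). */
static uint16_t thsens_raw(uint64_t reg, int sensor)
{
    return (uint16_t)(reg >> (sensor == 0 ? 32 : 48));
}

/* Applies the formula from Table 22: raw * 731 / 0x4000 - 273.
 * Integer division truncates the fractional part. */
static int thsens_to_celsius(uint16_t raw)
{
    return (int)(((uint32_t)raw * 731u) / 0x4000u) - 273;
}
```

For example, a raw sensor reading of 7000 gives 7000 * 731 / 16384 = 312 (truncated), and 312 - 273 = 39 degrees Celsius.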

4.9. Frequency Configuration Register (0x01B0)

The following sets of software multiplier setting registers are used to set the operating frequency of the chip master clock and the memory controller clock when CLKSEL is configured in software control mode (refer to the CLKSEL setting method in Descriptions of Pins).

The MEM CLOCK configuration supports multiple modes. In 4x mode (mem div), MEM CLOCK should be 4 times the memory controller clock; in 2x mode (mem div), MEM CLOCK should be 2 times the memory controller clock; in 1x mode (mem div), MEM CLOCK should equal the memory controller clock frequency.

The memory bus operates at 2 times the memory controller clock and the bus operates at 4 times the memory controller clock.

NODE CLOCK corresponds to the clock frequency of the processor core, on-chip network, and shared Cache.

Each clock configuration generally has three parameters, DIV_REFC, DIV_LOOPC, and DIV_OUT. The final clock frequency is (reference clock / DIV_REFC * DIV_LOOPC) / DIV_OUT.

In software control mode, the default corresponding clock frequency is the frequency of the external reference clock (100MHz or 25MHz), which needs to be set in software during the processor startup. The process of setting the individual clocks should be done as follows:

  1. Set the fields other than SEL_PLL_* and SOFT_SET_PLL; these two fields are written as 0 during this step.

  2. Keep the other field values unchanged and set SOFT_SET_PLL to 1.

  3. Wait for the lock signal LOCKED_* in the register to become 1.

  4. Set SEL_PLL_* to 1; the corresponding clock frequency then switches to the software-set frequency.
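The setting sequence above can be sketched in C for the node clock PLL as follows. This is only an illustration under stated assumptions: the intermediate step is assumed to be setting SOFT_SET_PLL to 1, the pointer `reg` is assumed to map the low 64 bits of the register at offset 0x1b0, and the function name and the `max_polls` bound on the lock wait are hypothetical. Field positions follow Table 23.

```c
#include <stdint.h>

#define SEL_PLL_NODE   (1ULL << 0)
#define SOFT_SET_PLL   (1ULL << 2)
#define LOCKED_L1      (1ULL << 16)
/* L1_DIV_REFC [31:26], L1_DIV_LOOPC [40:32], L1_DIV_OUT [47:42] */
#define L1_FIELDS_MASK ((0x3fULL << 26) | (0x1ffULL << 32) | (0x3fULL << 42))

/* Programs the L1 PLL dividers and switches the node clock to the PLL.
 * Returns 0 on success and -1 if LOCKED_L1 was not observed within
 * max_polls additional reads (a hypothetical bound, not from the manual). */
static int node_pll_setup(volatile uint64_t *reg, uint64_t div_refc,
                          uint64_t div_loopc, uint64_t div_out,
                          unsigned max_polls)
{
    uint64_t v = *reg;

    /* Step 1: write the dividers with SEL_PLL_NODE and SOFT_SET_PLL at 0. */
    v &= ~(SEL_PLL_NODE | SOFT_SET_PLL | L1_FIELDS_MASK);
    v |= (div_refc & 0x3f) << 26;
    v |= (div_loopc & 0x1ff) << 32;
    v |= (div_out & 0x3f) << 42;
    *reg = v;

    /* Step 2 (assumed): keep the other fields and set SOFT_SET_PLL. */
    *reg = v | SOFT_SET_PLL;

    /* Step 3: wait for the lock signal. */
    while ((*reg & LOCKED_L1) == 0) {
        if (max_polls-- == 0) {
            return -1;
        }
    }

    /* Step 4: switch the node clock to the PLL output. */
    *reg |= SEL_PLL_NODE;
    return 0;
}
```

Bounding the lock wait is a design choice of this sketch so that a PLL that never locks does not hang the caller; the manual itself only says to wait for the lock signal.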

The following register is the configuration register of the Main CLOCK. The Main CLOCK is used to generate the maximum operating frequency of the node clock, core clock, and so on. The base address is 0x1fe00000 and the offset address is 0x1b0:

Table 23. Node clock software multiplier configuration register
Bit Field Name Read/Write Reset Value Description

0

SEL_PLL_NODE

RW

0x0

Clock output selection

1: Node clock select PLL output

0: Node clock select SYS CLOCK

1

RW

0x0

Reserved

2

SOFT_SET_PLL

RW

0x0

Allow software to set the PLL

3

BYPASS_L1

RW

0x0

Bypass L1 PLL

7:4

-

RW

0x0

Reserved

8

VDDA_LDO_EN

RW

0x0

Enable VDDA LDO

9

VDDD_LDO_EN

RW

0x0

Enable VDDD LDO

11:10

-

12

DACPD_L2

RW

0x0

L2 clock DACPD

13

DSMPD_L2

RW

0x0

L2 clock DSMPD

15:14

RW

0x0

Reserved

16

LOCKED_L1

R

0x0

L1 PLL is locked or not

18:17

-

R

0x0

Reserved

19

PD_L1

RW

0x0

Disable L1 PLL

21:20

RW

0x0

Reserved

22

L2_SEL

RW

0x0

Select L2 clock output

25:23

RW

0x0

Reserved

31:26

L1_DIV_REFC

RW

0x1

L1 PLL input parameters

40:32

L1_DIV_LOOPC

RW

0x1

L1 PLL input parameters

41

Reserved

47:42

L1_DIV_OUT

RW

0x1

L1 PLL input parameters

53:48

L2_DIV_REFC

RW

63:54

L2_DIV_LOOPC

RW

69:64

L2_DIV_OUT

RW

95:70

-

RW

119:96

L2_FRAC

RW

122:120

VDDA_LDO_CTRL

RW

123

VDDA_LDO_BYPASS

RW

126:124

VDDD_LDO_CTRL

RW

127

VDDD_LDO_BYPASS

RW

Other

-

RW

Reserved

Note: PLL output = (clk_ref / div_refc * div_loopc) / div_out.

The result of clk_ref/div_refc for the PLL should be 25/50/100MHz, with 50MHz recommended. The VCO frequency (the part in parentheses in the above equation) must be in the range 4.8GHz-6.4GHz. This requirement also applies to memory PLLs.

In addition, the recommended setting for div_loopc is less than 255. The recommended setting for div_out is 1/2/4/6 and above 6, and 3/5 is not recommended.
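The formula and constraints in the note can be checked before a configuration is written. The sketch below works in MHz and treats the stated VCO range as inclusive; both choices, as well as the function names, are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* PLL output in MHz: (clk_ref / div_refc * div_loopc) / div_out.
 * Returns 0 when a divider is zero. */
static uint32_t pll_output_mhz(uint32_t clk_ref_mhz, uint32_t div_refc,
                               uint32_t div_loopc, uint32_t div_out)
{
    if (div_refc == 0 || div_out == 0) {
        return 0;
    }
    return (clk_ref_mhz / div_refc * div_loopc) / div_out;
}

/* Checks the constraints stated in the note: clk_ref / div_refc must be
 * 25, 50, or 100 MHz, and the VCO frequency must lie in 4800..6400 MHz. */
static bool pll_params_valid(uint32_t clk_ref_mhz, uint32_t div_refc,
                             uint32_t div_loopc)
{
    if (div_refc == 0 || clk_ref_mhz % div_refc != 0) {
        return false;
    }
    uint32_t pfd = clk_ref_mhz / div_refc;
    if (pfd != 25 && pfd != 50 && pfd != 100) {
        return false;
    }
    uint32_t vco = pfd * div_loopc;
    return vco >= 4800 && vco <= 6400;
}
```

For example, with a 100MHz reference, div_refc = 2 gives the recommended 50MHz, div_loopc = 100 gives a VCO frequency of 5000MHz, and div_out = 2 gives an output of 2500MHz.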

The following register is the MEM CLOCK configuration register. The MEM CLOCK frequency should be configured to 1/2 of the final DDR bus clock frequency. The base address is 0x1fe00000 and the offset address is 0x1c0:

Table 24. Memory clock software multiplier configuration register
Bit Field Name Read/Write Reset Value Description

[0]

SEL_MEM_PLL

RW

0x0

Clock output selection

1: Node clock select PLL output

0: Node clock select SYS CLOCK

[1]

SOFT_SET_MEM_PLL

RW

0x0

Allow software to set MEM PLL

[2]

BYPASS_MEM_PLL

RW

0x0

Bypass MEM_PLL

[3]

MEMDIV_RESETn

RW

0x1

Reset the internal frequency divider

[5:4]

MEMDIV_MODE

RW

00: 1X frequency multiplier mode

01: 2X frequency multiplier mode

10: 4X frequency multiplier mode

11: Reserved

[6]

LOCKED_MEM_PLL

R

0x0

MEM_PLL is locked or not

[7]

PD_MEM_PLL

RW

0x0

Disable MEM PLL

[13:8]

MEM_PLL_DIV_REFC

RW

0x1

MEM PLL input parameters

When the NODE clock is selected (NODE_CLOCK_SEL is 1), it is used as a frequency divider input

[23:14]

MEM_PLL_DIV_LOOPC

RW

0x41

MEM PLL input parameters

[29:24]

MEM_PLL_DIV_OUT

RW

0x0

MEM PLL input parameters

[30]

NODE_CLOCK_SEL

RW

0x0

0: use MEM_PLL as MEM clock

1: use NODE_CLOCK as a frequency divider input

[31]

-

[34:32]

VDDA_LDO_CTRL

RW

[35]

VDDA_LDO_BYPASS

RW

[38:36]

VDDD_LDO_CTRL

RW

[39]

VDDD_LDO_BYPASS

RW

[40]

VDDA_LDO_EN

[41]

VDDD_LDO_EN

RW

Other

RW

Reserved

4.10. Processor Core Frequency Division Configuration Register (0x01D0)

The following registers are used for dynamic frequency division of the processor core. Using this register to set the frequency of the processor core, the frequency conversion operation can be done within 100ns with no additional overhead. The base address is 0x1fe00000 and the offset address is 0x01d0.

Table 25. Processor core software frequency division configuration
Bit Field Name Read/Write Reset Value Description

2:0

core0_freqctrl

RW

0x7

Core 0 frequency division control value

3

core0_en

RW

0x1

Core 0 clock enable

6:4

core1_freqctrl

RW

0x7

Core 1 frequency division control value

7

core1_en

RW

0x1

Core 1 clock enable

10:8

core2_freqctrl

RW

0x7

Core 2 frequency division control value

11

core2_en

RW

0x1

Core 2 clock enable

14:12

core3_freqctrl

RW

0x7

Core 3 frequency division control value

15

core3_en

RW

0x1

Core 3 clock enable

Note: The clock frequency after software division equals (frequency division control value + 1)/8 of the original frequency.
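Each core occupies four bits of this register: a 3-bit frequency division control value followed by a clock enable bit. A minimal sketch of updating one core's fields in a register value is shown below; the function name is hypothetical, and reading and writing the register itself is left to the platform.

```c
#include <stdint.h>

/* Returns the register value with the fields of core `core` (0..3) updated:
 * bits [4*core+2 : 4*core] hold freqctrl and bit 4*core+3 holds the enable. */
static uint64_t set_core_freqctrl(uint64_t reg, unsigned core,
                                  unsigned freqctrl, int enable)
{
    unsigned shift = 4u * core;

    reg &= ~(0xfULL << shift);                           /* clear old fields */
    reg |= (uint64_t)(freqctrl & 0x7u) << shift;         /* coreN_freqctrl */
    reg |= (uint64_t)(enable ? 1u : 0u) << (shift + 3);  /* coreN_en */
    return reg;
}
```

Starting from the reset value 0xFFFF, setting core 1 to a control value of 3 with the clock enabled gives 0xFFBF, which runs core 1 at (3 + 1)/8 of the original frequency.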

4.11. Processor Core Reset Control Register (0x01D8)

The following register is used for software-controlled reset of the processor cores. To reset a core, set resetn to 0, set resetn_pre to 0, wait 500 microseconds, set resetn_pre to 1, and then set resetn to 1. The base address of this register is 0x1fe00000 and the offset address is 0x01d8.

Table 26. Processor core software reset control register
Bit Field Name Read/Write Reset Value Description

0

Core0_resetn_pre

RW

0x1

Core 0 reset auxiliary control

1

Core0_resetn

RW

0x1

Core 0 reset

2

Core1_resetn_pre

RW

0x1

Core 1 reset auxiliary control

3

Core1_resetn

RW

0x1

Core 1 reset

4

Core2_resetn_pre

RW

0x1

Core 2 reset auxiliary control

5

Core2_resetn

RW

0x1

Core 2 reset

6

Core3_resetn_pre

RW

0x1

Core 3 reset auxiliary control

7

Core3_resetn

RW

0x1

Core 3 reset
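The reset procedure described for this register amounts to four successive writes with a delay in the middle. The sketch below computes the values to write for one core; it is an illustration only, and the function name is hypothetical. Performing the writes and the 500 microsecond wait is left to the caller.

```c
#include <stdint.h>

/* Fills `steps` with the four successive values to write to the register
 * at offset 0x01d8 in order to reset core `core` (0..3), starting from the
 * current register value `reg`. The caller waits 500 microseconds between
 * writing steps[1] and steps[2]. */
static void core_reset_steps(uint64_t reg, unsigned core, uint64_t steps[4])
{
    uint64_t pre    = 1ULL << (2u * core);       /* CoreN_resetn_pre */
    uint64_t resetn = 1ULL << (2u * core + 1u);  /* CoreN_resetn */

    steps[0] = reg & ~resetn;      /* resetn = 0 */
    steps[1] = steps[0] & ~pre;    /* resetn_pre = 0, then wait 500 us */
    steps[2] = steps[1] | pre;     /* resetn_pre = 1 */
    steps[3] = steps[2] | resetn;  /* resetn = 1 */
}
```

For core 1 and a current value of 0xFF, the sequence is 0xF7, 0xF3, 0xF7, 0xFF, so the bits of the other cores are left untouched.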

4.12. Routing Configuration Register (0x0400)

The following registers are used to control some of the routing settings within the chip. The base address is 0x1fe00000 and the offset address is 0x0400.

Table 27. Chip routing configuration register
Bit Field Name Read/Write Reset Value Description

3:0

scid_sel

RW

0x0

Shared Cache hash bit control

7:4

Node_mask

RW

0xF

Node mask, used to avoid a missing response when a speculative access targets the address of an unused node

8

xrouter_en

RW

0x0

HT1 inter-chip routing enable control

9

disable_0x3ff0

RW

0x0

Disable routing via base address 0x3ff0_0000 of configuration register space

10

Fast_path_36_en

RW

0x0

Enable 36 fast paths (8-way)

11

Fast_path_27_en

RW

0x0

Enable 27 fast paths (8-way)

12

mcc_en

RW

0x0

MCC mode enable

14

Scahe_1MB

RW

SCache capacity cut in half

19:16

ccsd_id

RW

0x0

24

ccsd_en

RW

0x0

31:30

mc_en

RW

0x3

Enable routing control for both MCs

37:32

interleave_bit

RW

0x0

Memory hash control

39

interleave_en

RW

0x0

Memory hash enable

43:40

ht_control

R

Ht-related configuration pins

47:44

ht_reg_disable

RW

0x0

Close ht space for consistency mode to avoid routing HT space addresses to HT

60:56

Line_ag_cfg

RW

0x0

Cross-chip bandwidth balancing configuration

4.13. Other Function Configuration Register (0x0420)

The following registers are used to control some of the functions enabled within the chip. The base address is 0x1fe00000 and the offset address is 0x0420.

Table 28. Other function configuration register
Bit Field Name Read/Write Reset Value Description

0

disable_jtag

RW

0x0

Completely disable the JTAG interface

1

disable_jtag_LA464

RW

0x0

Completely disable the LA464 JTAG debug interface

2

disable_LA132

RW

0x0

Completely disable LA132

3

disable_jtag_LA132

RW

0x0

Completely disable the LA132 JTAG debug interface

4

Disable_antifuse0

RW

0x0

Disable fuse

5

Disable_antifuse1

RW

0x0

Disable fuse

6

Disable_ID

RW

0x0

Disable ID modification

7

Reserved

8

resetn_LA132

RW

0x0

LA132 reset control

9

sleeping_LA132

R

0x0

LA132 go to sleep

10

soft_int_LA132

RW

0x0

LA132 inter-processor interrupt register

15:12

core_int_en_LA132

RW

0x0

LA132 I/O interrupt enable for each core

18:16

freqscale_LA132

RW

0x0

LA132 frequency division control

19

clken_LA132

RW

0x0

LA132 clock enable

20

stable_sel

RW

0x0

Stable clock selection

0: SYS CLOCK

1: NODE CLOCK

21

stable_resetn

RW

0x0

Stable clock reset control

22

freqscale_percore

RW

0x0

Enable private frequency adjustment registers for each core

23

clken_percore

RW

0x0

Enable private clock for each core

27:24

confbus_timeout

RW

0x8

Configuration bus timeout setting. The actual time is a power of 2 determined by this value.

29:28

HT_softresetn

RW

0x3

HT Controller software reset control

35:32

freqscale_mode_core

RW

0x0

Frequency adjustment mode selection for each core

0: (n+1)/8

1: 1/(n+1)

36

freqscale_mode_node

RW

0x0

Frequency adjustment mode selection for nodes

0: (n+1)/8

1: 1/(n+1)

37

freqscale_mode_LA132

RW

0x0

Frequency adjustment mode selection for LA132

0: (n+1)/8

1: 1/(n+1)

39:38

freqscale_mode_HT

RW

0x0

Frequency adjustment mode selection for each HT

0: (n+1)/8

1: 1/(n+1)

40

freqscale_mode_stable

RW

0x0

Frequency adjustment mode selection for stable clock

0: (n+1)/8

1: 1/(n+1)

43:41

Reserved

46:44

freqscale_stable

RW

0x0

Stable clock frequency adjustment register

47

clken_stable

RW

0x0

Stable clock enable

48

EXT_INT_en

RW

0x0

Extended I/O interrupt enable

49

INT_encode

RW

0x0

Enable interrupt pin encode mode

53:52

54

57:56

thsensor_sel

RW

0x0

Temperature sensor selection

62:60

Auto_scale

R

0x0

Current value of automatic frequency adjustment

63

Auto_scale_doing

R

0x0

Flag indicating that automatic frequency adjustment is in effect

4.14. Centigrade Temperature Register (0x0428)

The following register is used to observe the value of the chip's internal temperature sensor, in degrees Celsius. The base address is 0x1fe00000 and the offset address is 0x0428. This register is available only when CSR[0x0008][0] is 1.

Table 29. Temperature observation register
Bit Field Name Read/Write Reset Value Description

7:0

Centigrade temperature

RO

0x0

Centigrade temperature

63:8

RW

0x0

4.15. SRAM Adjustment Register (0x0430)

The following register is used to adjust the operating frequency of the SRAM inside the processor core. The offset address is 0x0430.

Table 30. Processor core SRAM adjustment register
Bit Field Name Read/Write Reset Value Description

31:0

sram_ctrl

RW

0x0

Inter-processor sram configuration register

63:32

RW

0x0

4.16. FUSE0 Observation Register (0x0460)

The following registers are used to observe the Fuse0 values visible to some software. The offset address is 0x0460.

Table 31. FUSE0 observation register
Bit Field Name Read/Write Reset Value Description

127:0

Fuse_0

RW

0x0

4.17. FUSE1 Observation Register (0x0470)

The following registers are used to observe the Fuse1 values visible to some software. The offset address is 0x0470.

Table 32. FUSE1 observation register
Bit Field Name Read/Write Reset Value Description

127:0

Fuse_1

RW

0x0

5. Chip Clock Division and Enable Control

The Loongson 3A5000 can use a single external reference clock, SYS_CLOCK. The generation of each clock can depend on SYS_CLOCK, and the following sections describe each of these clocks.

The Loongson 3A5000 has separate frequency dividing mechanisms for the processor core, on-chip network and shared Cache, HT controller, and LA132 core. As in the 3A4000, the 3A5000 also supports 1/n divider values. The related registers can also be accessed using the configuration register instruction (IOCSR).

The base address of each chip configuration register in this chapter is 0x1fe00000.

5.1. Introduction to Chip Module clock

The chip reference clock SYS_CLOCK usually uses a 100MHz crystal input, but a 25MHz crystal input is also available. Different crystal frequencies need to be selected via CLKSEL[4].

The reference clock of the HT PHY can use the 200MHz differential reference input of each PHY in addition to the SYS CLOCK. Use CLKSEL[8] to make the selection. When SYS CLOCK is selected as the reference clock and a 25MHz crystal input is used, the HT PHY cannot operate at 3.2GHz.

The clocks used in the Loongson 3A5000 chip and their control methods are shown in the following table.

Table 33. Processor internal clock description
Clock Clock Source Frequency Multiplier Method Frequency Division Control Enable Control Clock Description

Boot clock

SYS_CLOCK

*1

Not supported

Not supported

SPI, UART, I2C controller clock

Main clock

SYS PLL

PLL configuration

Not supported

Not supported

SYS PLL output

Node clock, core clock, HTcore clock, LA132 clock source

Optional mem clock, stable clock source

Node clock

Main clock

*1

Supported

Not supported

On-chip network, shared Cache, node clock, HT controller clock source

Core0 clock

Main clock

*1

Supported

Supported

Core0 clock

Core1 clock

Main clock

*1

Supported

Supported

Core1 clock

Core2 clock

Main clock

*1

Supported

Supported

Core2 clock

Core3 clock

Main clock

*1

Supported

Supported

Core3 clock

HTcore0 clock

Node clock

*1

Supported

Supported

HT0 controller clock; software must ensure that it is below 1GHz after frequency division

HTcore1 clock

Node clock

*1

Supported

Supported

HT1 controller clock; software must ensure that it is below 1GHz after frequency division

LA132 clock

Main clock

*1

Supported

Supported

LA132 clock; software must ensure that it is below 1GHz after frequency division

Stable clock

SYS_CLOCK

*1

Supported

Supported

Processor core constant counter clock

Mem clock

MEM PLL

PLL configuration

Not supported

Supported

Memory controller clock

Main clock

/2, /4, /8

Not supported

Supported

Memory controller alternative clock

5.2. Processor Core Frequency Division and Enable Control

There are two access modes for processor core frequency division: access by address and access by processor configuration register instruction. Both are described below. Each processor core can be controlled separately.

5.2.1. Accessing by Address

The per-address access mode is compatible with the 3A3000 processor and uses the processor core software frequency divider setup register, which uses the same address for setup.

Setting the processor core frequency through this register completes the frequency change within 100ns with no additional overhead. The base address is 0x1fe00000 and the offset address is 0x01d0; the register can also be accessed using the configuration register instruction (IOCSR).

Table 34. Processor core software frequency division configuration
Bit Field Name Read/Write Reset Value Description

2:0

core0_freqctrl

RW

0x7

Core 0 frequency division control value

3

core0_en

RW

0x1

Core 0 clock enable

6:4

core1_freqctrl

RW

0x7

Core 1 frequency division control value

7

core1_en

RW

0x1

Core 1 clock enable

10:8

core2_freqctrl

RW

0x7

Core 2 frequency division control value

11

core2_en

RW

0x1

Core 2 clock enable

14:12

core3_freqctrl

RW

0x7

Core 3 frequency division control value

15

core3_en

RW

0x1

Core 3 clock enable

Note: The clock frequency after software division equals (frequency division control value + 1)/8 of the original frequency.

In addition to the frequency division configuration compatible with the 3A3000 processor, the 3A5000 allows the clock frequency after division to be switched from (frequency division control value + 1)/8 to 1/(frequency division control value + 1) by setting a register field. This field is located in the other function configuration register. The base address is 0x1fe00000 and the offset address is 0x0420; the register can also be accessed using the configuration register instruction (IOCSR).

Table 35. Other function configuration register
Bit Field Name Read/Write Reset Value Description

35:32

freqscale_mode_core

RW

0x0

Frequency adjustment mode selection for each core

0: (n+1)/8

1: 1/(n+1)
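The two frequency adjustment modes give different results for the same control value n. The following sketch computes the frequency after division for both modes; the function name and the use of MHz are assumptions made for illustration.

```c
#include <stdint.h>

/* Frequency after division for control value n (0..7).
 * mode 0: original * (n + 1) / 8;  mode 1: original / (n + 1). */
static uint32_t scaled_freq_mhz(uint32_t original_mhz, unsigned n, int mode)
{
    n &= 0x7u;
    if (mode == 0) {
        return original_mhz * (n + 1u) / 8u;
    }
    return original_mhz / (n + 1u);
}
```

With an original frequency of 2400MHz and n = 2, mode 0 gives 900MHz and mode 1 gives 800MHz. Note that full speed corresponds to n = 7 in mode 0 but to n = 0 in mode 1.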

5.2.2. Accessing by Configuration Register Instructions

In addition to the legacy per-address access mode, the 3A5000 also supports access to private frequency division configuration registers using the configuration register instruction.

Note that control through the private frequency division configuration register and control through the original processor core software frequency division register are mutually exclusive; only one of the two can be used. The choice is made with the corresponding bits in the other function configuration register. This register has a base address of 0x1fe00000 and an offset address of 0x0420; it can also be accessed using the configuration register instruction (IOCSR).

Table 36. Other function configuration register
Bit Field Name Read/Write Reset Value Description

22

freqscale_percore

RW

0x0

Enable private frequency adjustment registers for each core

23

clken_percore

RW

0x0

Enable private clock for each core

When freqscale_percore is set to 1, the freqscale field in the private frequency division configuration register sets the divider value of the core’s own clock (together with freqscale_mode). When clken_percore is set to 1, the clken bit in the private frequency division configuration register controls the clock enable.

This configuration register is defined as follows. The offset address is 0x1050.

Table 37. Processor core private frequency division register
Bit Field Name Read/Write Reset Value Description

4

freqscale_mode

RW

0x0

Current processor core frequency division mode selection

0: (n+1)/8

1: 1/(n+1)

3

clken

RW

0x0

Current processor core clock enable

2:0

freqscale

RW

0x0

Current processor core frequency divider configuration

5.3. Node Clock Division and Enable Control

The node clock is the clock used by the on-chip network and shared Cache, and has two different control modes, a software setting mode and a hardware automatic frequency division setting.

The node clock does not support full shutdown, so there is no corresponding clken control bit.

5.3.1. Software Configuration

The software configuration method is compatible with the 3A3000 processor and uses the same address to set the node frequency division bits in the function configuration register.

This register has a base address of 0x1fe00000 and an offset address of 0x0180; it can also be accessed using the configuration register instruction (IOCSR).

Table 38. Function configuration register
Bit Field Name Read/Write Reset Value Description

42:40

Node0_freq_ctrl

RW

3’b111

Node 0 frequency division

As with the processor core’s frequency division control, the node clock frequency after division can be switched from (frequency division control value + 1)/8 to 1/(frequency division control value + 1) by setting a register bit. This bit is located in the other function configuration register. The base address is 0x1fe00000 and the offset address is 0x0420.

Table 39. Other function configuration register
Bit Field Name Read/Write Reset Value Description

36

freqscale_mode_node

RW

0x0

Frequency adjustment mode selection for nodes

0: (n+1)/8

1: 1/(n+1)

5.3.2. Hardware Automatic Configuration

In addition to the active setting by the software, the node clock also supports the automatic frequency division setting triggered by the temperature sensor.

The auto-division setting is set by the software in advance for different temperatures, and the corresponding auto-division setting will be triggered when the temperature of the temperature sensor reaches the corresponding preset value.

To keep the chip operating in a high-temperature environment, automatic high-temperature frequency reduction can be configured so that the chip actively divides its clock when the temperature exceeds the preset range, which reduces the chip's toggle rate. See Section 12.3 for details on how to set it up.

5.4. HT Controller Frequency Division and Enable Control

The frequency division mechanism of the HT controller is similar to the others. The two HT controllers can be controlled separately. The settings are made using the corresponding bits in the function configuration register. Its base address is 0x1fe00000 and its offset address is 0x0180; it can also be accessed using the configuration register instruction (IOCSR).

Table 40. Function configuration register
Bit Field Name Read/Write Reset Value Description

26:24

HT0_freq_scale_ctrl

RW

3’b011

HT Controller 0 frequency division

27

HT0_clken

RW

1’b1

Whether to enable HT0

30:28

HT1_freq_scale_ctrl

RW

3’b011

HT Controller 1 frequency division

31

HT1_clken

RW

1’b1

Whether to enable HT1

As with the other frequency division controls, the HT controller clock frequency after division can be switched from (frequency division control value + 1)/8 to 1/(frequency division control value + 1) by setting a register field. This field is located in the other function configuration register. The base address is 0x1fe00000 and the offset address is 0x0420; the register can also be accessed using the configuration register instruction (IOCSR).

Note that since the HT core clock is derived from the Node clock, it is also affected by the Node clock frequency division.

Table 41. Other function configuration register
Bit Field Name Read/Write Reset Value Description

39:38

freqscale_mode_HT

RW

0x0

Frequency adjustment mode selection for each HT

0: (n+1)/8

1: 1/(n+1)

5.5. Stable Counter Frequency Division and Enable Control

The frequency division mechanism of the Stable Counter is similar to the others. It is set using the corresponding bits in the other function configuration register. Its base address is 0x1fe00000 and its offset address is 0x0420; it can also be accessed using the configuration register instruction (IOCSR).

Table 42. Other function configuration register
Bit Field Name Read/Write Reset Value Description

20

stable_sel

RW

0x0

Stable clock selection

0: SYS CLOCK

1: NODE CLOCK

21

stable_resetn

RW

0x0

Stable clock reset control

1: set reset state

0: unset software reset

40

freqscale_mode_stable

RW

0x0

Frequency adjustment mode selection for stable clock

0: (n+1)/8

1: 1/(n+1)

46:44

freqscale_stable

RW

0x0

Stable clock frequency adjustment register

47

clken_stable

RW

0x0

Stable clock enable

Note that setting stable_resetn to 0 releases only the software reset. If GPIO_FUNC_En[13] is 1 at that time, the reset of the stable counter is still controlled by GPIO[13] (active low).

The GPIO output enable register has a base address of 0x1fe00000 and an offset address of 0x0500; it can also be accessed using the configuration register instruction (IOCSR).

Table 43. GPIO output enable register
Bit Field Name Read/Write Reset Value Description

31:0

GPIO_OEn

RW

32’hffffffff

GPIO output enable (active low)

63:32

GPIO_FUNC_En

RW

32’hffff0000

GPIO function enable (active low)

6. Software Clock System

Several different levels of usage are defined in the Loongson 3A5000 processor for the clocks used by the system software. Inside the processor core are the legacy counter/compare registers, the stable counter registers, and the chip-level node counter registers.

The following sections introduce the stable counter and the node counter.

6.1. Stable Counter

The constant clock source in the 3A5000 is called the stable counter. It runs on a master clock that is separate from the processor core’s own clock and from the node clock.

In the 3A5000, both the processor core clock and the node clock are derived from the master clock, but both can freely control the number of divisions (see the introduction in the previous chapter), while the clock of the stable counter is derived from the input reference clock and can also be independently divided and does not vary with the frequency of other clocks.

Based on this clock source, a counter and a timer are implemented. This section mainly introduces the registers related to the Stable counter.

6.1.1. Configuration Address for Stable Timer

Using the Stable counter clock source, a monotonically increasing counter and a timer that counts down from a set value are implemented; each processor core has its own independent Stable counter and Stable timer. The counter can be accessed only through specific instructions such as rdhwr and DRDTIME; the timer can be accessed by address with load/store instructions or with the CSR configuration register instructions.

Table 44. Address access method
Name Offset Address Read/Write Description

Core0_timer_config

0x1060

RW

Timer configuration register for processor core 0

Core0_timer_ticks

0x1070

R

Timer remaining value for processor core 0

Core1_timer_config

0x1160

RW

Timer configuration register for processor core 1

Core1_timer_ticks

0x1170

R

Timer remaining value for processor core 1

Core2_timer_config

0x1260

RW

Timer configuration register for processor core 2

Core2_timer_ticks

0x1270

R

Timer remaining value for processor core 2

Core3_timer_config

0x1360

RW

Timer configuration register for processor core 3

Core3_timer_ticks

0x1370

R

Timer remaining value for processor core 3

Table 45. Configuration register instruction access method
Name Offset Address Read/Write Description

percore_timer_config

0x1060

RW

Timer configuration register of the current processor core

percore_timer_ticks

0x1070

R

Timer remaining value for the current processor core

Table 46. Register description
Bit Field Name Read/Write Reset Value Description

timer_config

63

1

RW

0x1

Reset to 1, and should write 1

62

Periodic

RW

0x0

Cycle count enable. When this bit is 1, the timer is automatically reset to the value of the InitVal field in timer_config after decreasing to 0

61

Enable

RW

0x0

General enable. The timer is active when this bit is 1

47:0

InitVal

RW

0x0

The initial value for conducting the countdown

timer_ticks

63:48

0

R

0x0

Value 0

47:0

Ticks

R

0x0

The remaining value of the countdown. When not in a cycle count, the value will stay at 48’hffff_ffff_ffff when the count is complete
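The timer_config layout in Table 46 can be assembled with a few shifts and masks. The sketch below builds a configuration value and extracts the remaining ticks from a timer_ticks value; the macro and function names are illustrative assumptions, and writing the value to offset 0x1060 (or the per-core address) is left to the platform.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMER_MUST_BE_ONE  (1ULL << 63)          /* bit 63: should write 1 */
#define TIMER_PERIODIC     (1ULL << 62)          /* bit 62: Periodic */
#define TIMER_ENABLE       (1ULL << 61)          /* bit 61: Enable */
#define TIMER_INITVAL_MASK ((1ULL << 48) - 1)    /* bits 47:0: InitVal */

/* Builds a timer_config value; init_val is truncated to 48 bits. */
static uint64_t make_timer_config(uint64_t init_val, bool periodic, bool enable)
{
    uint64_t v = TIMER_MUST_BE_ONE | (init_val & TIMER_INITVAL_MASK);

    if (periodic) {
        v |= TIMER_PERIODIC;
    }
    if (enable) {
        v |= TIMER_ENABLE;
    }
    return v;
}

/* Extracts the Ticks field (bits 47:0) from a timer_ticks value. */
static uint64_t timer_ticks_remaining(uint64_t ticks_reg)
{
    return ticks_reg & TIMER_INITVAL_MASK;
}
```

For example, an enabled periodic timer with an initial value of 1000 is encoded as 0xE0000000000003E8.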

6.1.2. Clock Control for Stable Counter

The Stable counter can use either the reference clock input or the master clock as its source, and its frequency can be divided under software control. The reference clock input is generally recommended because, unlike the master clock, it is completely unaffected by dynamic frequency adjustment.

The clock control fields of the Stable counter are listed below. They are located in the other function configuration register of the chip. The base address is 0x1fe00000 and the offset address is 0x0420; the register can also be accessed using the configuration register instruction (IOCSR).

Table 47. Other function configuration register
Bit Field Name Read/Write Reset Value Description

20

stable_sel

RW

0x0

Stable clock selection

0: SYS CLOCK

1: NODE CLOCK

21

stable_resetn

RW

0x0

Stable clock reset control

1: set reset state

0: unset software reset

40

freqscale_mode_stable

RW

0x0

Frequency adjustment mode selection for stable clock

0: (n+1)/8

1: 1/(n+1)

46:44

freqscale_stable

RW

0x0

Stable clock frequency adjustment register

47

clken_stable

RW

0x0

Stable clock enable

After the BIOS has configured the Stable counter clock source, the MCSR section in each processor core needs to be updated to control the values of CPUCFG.0x4 and CPUCFG.0x5. Referring to the description in Instruction set features implemented in 3A5000, CPUCFG.0x4 should be filled with the crystal clock frequency in Hz; CPUCFG.0x5[31:16] should be filled with the dividing factor; CPUCFG.0x5[15:0] should be filled with the multiplication factor. The latter two should be filled in with the help of BIOS for calculation, so that the result of CCFreq*CFM/CFD is equal to the actual frequency of Stable Counter.
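The relationship CCFreq*CFM/CFD can be checked with a short calculation. The sketch below packs the two factors into a CPUCFG.0x5 value using the bit positions given above and computes the resulting Stable Counter frequency; the function names are hypothetical, and how the BIOS actually updates these values is not covered here.

```c
#include <stdint.h>

/* Packs CPUCFG.0x5: dividing factor (CFD) in bits 31:16 and
 * multiplication factor (CFM) in bits 15:0. */
static uint32_t pack_cpucfg5(uint16_t cfm, uint16_t cfd)
{
    return ((uint32_t)cfd << 16) | cfm;
}

/* Stable Counter frequency in Hz: CCFreq * CFM / CFD.
 * Returns 0 when CFD is zero. */
static uint64_t stable_counter_hz(uint32_t ccfreq_hz, uint32_t cpucfg5)
{
    uint32_t cfm = cpucfg5 & 0xffffu;
    uint32_t cfd = cpucfg5 >> 16;

    if (cfd == 0) {
        return 0;
    }
    return (uint64_t)ccfreq_hz * cfm / cfd;
}
```

For example, with a 100MHz crystal (CCFreq = 100000000), CFM = 1, and CFD = 4, CPUCFG.0x5 is 0x00040001 and the computed Stable Counter frequency is 25MHz.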

6.1.3. Calibration of Stable Counter

In the single-chip case, the counter difference between cores is within 2 cycles, and no special calibration is needed. In the multi-chip case, the differences between chips are large, and a special hardware and software calibration mechanism is needed to keep the counter difference between cores below 100ns.

First, to ensure that the master clock of each chip does not deviate during use, the same crystal is used to drive SYS_CLK for all chips.

Second, to ensure that the Stable counters of all chips start counting at the same moment, the hardware uses the multiplexed function of two GPIO pins. Node 0 uses GPIO12 to output the reset signal, and all nodes (including Node 0) use GPIO13 to input the reset signal (GPIO13 needs to be configured for the Stable counter function). On the motherboard, a buffer device needs to be used to ensure the reset timing (mainly the signal slope); the better the reset timing, the smaller the clock difference between chips.

Software must reset the global Stable counter via GPIO12 before using the Stable counter. Before the reset, it must ensure that the clock selection of every chip is consistent and that the reset of every chip has been released. This work is usually done by the BIOS. The connection scheme of the system is shown in the following figure.

Figure 4. Stable reset control for multi-chip interconnection

6.2. Node Counter

The behavior of the Node counter in the Loongson 3A5000 is the same as that of the 3A4000.

Note that the counting frequency of the Node counter is exactly the same as the Node clock. If the Node counter is used as the basis for time calculation, changes to the Node clock frequency should be avoided.

6.2.1. Accessing by Address

The address-based access mode is compatible with the 3A3000 processor and uses the same addresses. The base address of the configuration register is 0x1fe00000. The register can also be accessed using the configuration register instruction (IOCSR), as shown in the table below.

Table 48. Node counter register
Name Offset Address Read/Write Description

Node counter

0x0408

R

64-bit node clock count

6.3. Summary of Clock System

The new Stable counter in the Loongson 3A5000 is more stable than the Node counter and the CP0 counter, because it does not change with other clocks (node clock and core clock).

The Stable counter is also easier to use: it can be read by instruction in both user state and guest state. The Stable counter is the preferred solution for the software reference clock system.

Node clock is more of a design for legacy compatibility and is a backup solution for the clock system. It will be phased out in future chip designs.

7. GPIO Control

Up to 32 GPIOs are provided in the 3A5000 for system use, and most of them are multiplexed with other functions. The GPIOs can also be configured as interrupt inputs and their interrupt levels can be set through register settings.

The base address of each chip configuration register in this chapter is 0x1fe00000.

7.1. Output Enable Register (0x0500)

The base address is 0x1fe00000 and the offset address is 0x0500.

Table 49. Output enable register
Bit Field Name Read/Write Reset Value Description

31:0

GPIO_OEn

RW

32’hffffffff

GPIO output enable (active low)

63:32

GPIO_FUNC_En

RW

32’hffff0000

GPIO function enable (active low)

7.2. Input/Output Register (0x0508)

The base address is 0x1fe00000 and the offset address is 0x0508.

Table 50. Input/Output Register
Bit Field Name Read/Write Reset Value Description

31:0

GPIO_O

RW

32’h0

GPIO output configuration

63:32

GPIO_I

RO

32’h0

GPIO input status
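
The active-low enable bits in the two registers above are easy to get backwards, so the following sketch shows the bit manipulation on plain register values. The helper names are hypothetical. The actual read and write of offsets 0x0500 and 0x0508 (memory-mapped or via IOCSR) is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* New value for the register at offset 0x0500 that makes `pin` a GPIO
 * output. Both GPIO_OEn[pin] and GPIO_FUNC_En[pin] are active low. */
uint64_t gpio_make_output(uint64_t oe_reg, unsigned pin)
{
    assert(pin < 32);
    oe_reg &= ~(1ULL << pin);          /* GPIO_OEn = 0: output enabled */
    oe_reg &= ~(1ULL << (32 + pin));   /* GPIO_FUNC_En = 0: GPIO function */
    return oe_reg;
}

/* New value for the register at offset 0x0508 with GPIO_O[pin] = level. */
uint64_t gpio_set_level(uint64_t io_reg, unsigned pin, int level)
{
    assert(pin < 32);
    if (level)
        return io_reg | (1ULL << pin);
    return io_reg & ~(1ULL << pin);
}

/* Extracts GPIO_I[pin] from a value read at offset 0x0508. */
int gpio_get_level(uint64_t io_reg, unsigned pin)
{
    assert(pin < 32);
    return (int)((io_reg >> (32 + pin)) & 1);
}
```

Starting from the documented reset value 0xffff0000ffffffff, making GPIO12 an output only clears bit 12, because the GPIO function of GPIO00-GPIO15 is already enabled at reset.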

7.3. Interrupt Control Register (0x0510)

The base address is 0x1fe00000 and the offset address is 0x0510.

Table 51. Interrupt control register
Bit Field Name Read/Write Reset Value Description

31:0

GPIO_INT_Pol

RW

32’h0

GPIO interrupt active level configuration

0 - active low

1 - active high

63:32

GPIO_INT_en

RW

32’h0

GPIO interrupt enable control (active high)

7.4. GPIO Pin Function Multiplexing Table

Many GPIO pins of the 3A5000 are multiplexed with other functions. The following table lists the function selection for these pins.

Note that GPIO00-GPIO15 default to the GPIO function when the chip is reset, and their default state is input, so they do not drive the I/O.

GPIO16-GPIO31 default to the HT function at reset. To prevent the internal logic from driving the corresponding I/O, the corresponding HT0/1_Hi/Lo_Hostmode pins can be pulled down. In that case the default function at reset is still HT, but the pins do not drive the I/O and do not affect external devices. Software only needs to switch the function to GPIO mode before using the GPIO function.

Table 52. Table of GPIO multiplexing function
GPIO Register Pin Name Multiplexing Function Default Function

0

GPIO00

SPI_CSn1

GPIO

1

GPIO01

SPI_CSn2

GPIO

2

GPIO02

UART1_RXD

GPIO

3

GPIO03

UART1_TXD

GPIO

4

GPIO04

UART1_RTS

GPIO

5

GPIO05

UART1_CTS

GPIO

6

GPIO06

UART1_DTR

GPIO

7

GPIO07

UART1_DSR

GPIO

8

GPIO08

UART1_DCD

GPIO

9

GPIO09

UART1_RI

GPIO

10

GPIO10

-

GPIO

11

GPIO11

-

GPIO

12

GPIO12

-

GPIO

13

GPIO13

SCNT_RSTn

GPIO

14

GPIO14

PROCHOTn

GPIO

15

GPIO15

THERMTRIPn

GPIO

16

HT0_LO_POWEROK

GPIO16

HT0_LO_POWEROK

17

HT0_LO_RSTn

GPIO17

HT0_LO_RSTn

18

HT0_LO_LDT_REQn

GPIO18

HT0_LO_LDT_REQn

19

HT0_LO_LDT_STOPn

GPIO19

HT0_LO_LDT_STOPn

20

HT0_HI_POWEROK

GPIO20

HT0_HI_POWEROK

21

HT0_HI_RSTn

GPIO21

HT0_HI_RSTn

22

HT0_HI_LDT_REQn

GPIO22

HT0_HI_LDT_REQn

23

HT0_HI_LDT_STOPn

GPIO23

HT0_HI_LDT_STOPn

24

HT1_LO_POWEROK

GPIO24

HT1_LO_POWEROK

25

HT1_LO_RSTn

GPIO25

HT1_LO_RSTn

26

HT1_LO_LDT_REQn

GPIO26

HT1_LO_LDT_REQn

27

HT1_LO_LDT_STOPn

GPIO27

HT1_LO_LDT_STOPn

28

HT1_HI_POWEROK

GPIO28

HT1_HI_POWEROK

29

HT1_HI_RSTn

GPIO29

HT1_HI_RSTn

30

HT1_HI_LDT_REQn

GPIO30

HT1_HI_LDT_REQn

31

HT1_HI_LDT_STOPn

GPIO31

HT1_HI_LDT_STOPn

7.5. GPIO Interrupt Control

The GPIO pins in the 3A5000 can all be used as interrupt inputs.

GPIO00, GPIO08, GPIO16, GPIO24 share the interrupt controller’s interrupt line 0.

GPIO01, GPIO09, GPIO17, GPIO25 share the interrupt controller’s interrupt line 1.

GPIO02, GPIO10, GPIO18, GPIO26 share the interrupt controller’s interrupt line 2.

GPIO03, GPIO11, GPIO19, GPIO27 share the interrupt controller’s interrupt line 3.

GPIO04, GPIO12, GPIO20, GPIO28 share the interrupt controller’s interrupt line 4.

GPIO05, GPIO13, GPIO21, GPIO29 share the interrupt controller’s interrupt line 5.

GPIO06, GPIO14, GPIO22, GPIO30 share the interrupt controller’s interrupt line 6.

GPIO07, GPIO15, GPIO23, GPIO31 share the interrupt controller’s interrupt line 7.

The interrupt enable of each GPIO is controlled by the configuration register field GPIO_INT_en, and the interrupt active level is controlled by GPIO_INT_Pol. The register is as follows:

The base address is 0x1fe00000 and the offset address is 0x0510.

Table 53. Interrupt control register
Bit Field Name Read/Write Reset Value Description

31:0

GPIO_INT_Pol

RW

32’h0

GPIO interrupt active level configuration

0 - active low

1 - active high

63:32

GPIO_INT_en

RW

32’h0

GPIO interrupt enable control (active high)

When only one GPIO is enabled on an interrupt line of the interrupt controller, the interrupt can be triggered on a fixed edge (falling edge when POL is set to 0, rising edge when POL is set to 1) and recorded in the interrupt controller using edge triggering.
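
The sharing pattern listed above amounts to taking the GPIO number modulo 8. The sketch below shows that mapping and the bit layout of the interrupt control register at offset 0x0510. The helper names are hypothetical, and the functions only compute register values.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt controller line shared by a GPIO: GPIO00/08/16/24 -> line 0,
 * GPIO01/09/17/25 -> line 1, and so on. */
unsigned gpio_irq_line(unsigned pin)
{
    assert(pin < 32);
    return pin % 8;
}

/* New value for the register at offset 0x0510 that enables the interrupt of
 * `pin` with the requested active level. */
uint64_t gpio_irq_enable(uint64_t int_reg, unsigned pin, int active_high)
{
    assert(pin < 32);
    if (active_high)
        int_reg |= 1ULL << pin;        /* GPIO_INT_Pol = 1: active high */
    else
        int_reg &= ~(1ULL << pin);     /* GPIO_INT_Pol = 0: active low */
    int_reg |= 1ULL << (32 + pin);     /* GPIO_INT_en = 1: enabled */
    return int_reg;
}
```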

8. LA464 Processor Core

LA464 is a four-issue 64-bit high-performance processor core. It can be used as a single core for high-end embedded and desktop applications, or as a basic processor core to form an on-chip multi-core system for server and high-performance machine applications. The LA464 cores in the Loongson 3A5000 are connected through the AXI interconnect network to a shared on-chip last-level Cache, forming a distributed multi-core architecture. The main features of LA464 are as follows:

  • Supports the Loongson autonomous instruction set (LoongArch).

  • Four-issue superscalar architecture with four fixed-point units, two vector units, and two memory access units.

  • Each vector unit is 256 bits wide and supports up to eight double 32-bit floating-point multiply-add operations.

  • Memory access units support 256-bit memory access, with 64-bit virtual addresses and 48-bit physical addresses.

  • Supports register renaming, dynamic scheduling, branch prediction, and other out-of-order execution techniques.

  • 64 fully associative entries plus 2048 entries organized as 8-way set-associative, for a total of 2112 TLB entries; 64 instruction TLB entries; variable page size.

  • First-level instruction Cache and data Cache of 64KB each, 4-way set-associative.

  • Victim Cache as a private second-level Cache, 256KB in size, 16-way set-associative.

  • Supports access optimization techniques such as non-blocking access and load speculation.

  • Supports a Cache coherence protocol for on-chip multi-core processors.

  • Supports parity check for the first-level Cache and ECC check for the second-level Cache and the on-chip last-level Cache.

  • Supports a standard JTAG debugging interface for easy hardware and software debugging.

The structure of LA464 is shown in the following figure.

Figure 5. LA464 structure

8.1. Instruction set features implemented in 3A5000

The functional features of the Loongson instruction set implemented in the Loongson 3A5000 can be identified dynamically through the instruction set attribute identification mechanism.

CPUCFG is a user-state instruction, used as CPUCFG rd, rj. The source operand register rj holds the number of the configuration information word to be accessed, and the returned configuration information word is written to register rd. Each configuration information word contains up to 32 bits of configuration information. For example, bit 0 of configuration word 1 indicates whether the LA32 architecture is implemented; this configuration information is expressed as CPUCFG.1.LA32[bit0], where 1 is the number of the configuration information word, LA32 is the mnemonic name of the configuration information field, and bit0 means that the field LA32 is located in bit 0 of the configuration word. If the configuration information occupies multiple bits, its location is recorded in the form bitAA:BB, which denotes the (AA-BB+1) consecutive bits from bit AA down to bit BB of the configuration information word.

The following table gives a list of configuration information for the instruction set functions implemented in the 3A5000. The last column, “Possible Value”, indicates a possible value to be read from this register, but does not imply that this is the value to be read from the 3A5000 processor. Please refer to the results of the actual hardware execution of the instruction, and make subsequent software judgments based on the actual read values.
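
The bitAA:BB notation maps directly onto a shift and a mask. The sketch below extracts a field from a configuration word that has already been read. The helper name cpucfg_field is hypothetical, and the sample word 0x2f2fe in the usage note is composed from the "Possible Value" column of the table below, not read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the field located at bit hi:lo (inclusive) of a configuration word. */
uint32_t cpucfg_field(uint32_t word, unsigned hi, unsigned lo)
{
    uint32_t width, mask;

    assert(hi < 32 && lo <= hi);
    width = hi - lo + 1;
    /* A 32-bit shift by 32 is undefined in C, so handle the full width apart. */
    mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}
```

For a configuration word 1 equal to 0x2f2fe, cpucfg_field(word, 11, 4) yields 47, so the physical address width PALEN is 48 bits.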

Table 54. List of configuration information for the instruction set functions implemented in the 3A5000

Register Number

Bit Field

Name

Description

Possible Value

0x0

31:0

PRId

Processor Identity

32’h14_c010

0x1

1:0

ARCH

2’b00 indicates implementation of simplified LA32; 2’b01 indicates implementation of LA32; 2’b10 indicates implementation of LA64. 2’b11 is reserved

2’b10

2

PGMMU

1 indicates that the MMU supports page mapping mode

1’b1

3

IOCSR

1 indicates support for the IOCSR instruction

1’b1

11:4

PALEN

The value of the supported physical address bits PALEN minus 1

8’d47

19:12

VALEN

The value of the supported virtual address bits VALEN minus 1

8’d47

20

UAL

1 indicates support for non-aligned memory access

1’b1

21

RI

1 indicates support for the “Read Inhibit” page attribute

1’b1

22

EP

1 indicates support for “Execution Protection” page attribute

1’b1

23

RPLV

1 indicates support for RPLV page attributes

1’b1

24

HP

1 indicates support for the huge page attribute

1’b1

25

IOCSR_BRD

1 indicates a string with processor product information recorded at address 0 of the IOCSR access space

1’b1

26

MSG_INT

1 means that the external interrupt uses the message interrupt mode, otherwise it is the level interrupt line mode

1’b0

0x2

0

FP

1 means support for basic floating-point instructions

1’b1

1

FP_SP

1 indicates support for single-precision floating-point numbers

1’b1

2

FP_DP

1 indicates support for double-precision floating-point numbers

1’b1

5:3

FP_ver

The version number of the floating-point arithmetic standard. 1 is the initial version number, indicating that it is compatible with the IEEE 754-2008 standard

3’h1

6

LSX

1 indicates support for 128-bit vector extension

1’b1

7

LASX

1 indicates support for 256-bit vector extension

1’b1

8

COMPLEX

1 indicates support for complex vector operation instructions

1’b1

14

LLFTP

1 indicates support for constant frequency counter and timer

1’b1

17:15

LLFTP_ver

Constant frequency counter and timer version number. 1 is the initial version

3’h1

21

LSPW

1 indicates support for the software page table walking instruction

1’b1

22

LAM

1 indicates support for the AM* atomic memory access instructions

1’b1

0x3

0

CCDMA

1 indicates support for hardware Cache coherent DMA

1’b1

1

SFB

1 indicates support for Store Fill Buffer (SFB)

1’b1

3

LLEXC

1 indicates support for LL instruction to fetch exclusive block function

1’b1

4

SCDLY

1 indicates support for the random delay function after SC

1’b1

5

LLDBAR

1 indicates support for the LL with automatic dbar function

1’b1

6

ITLBT

1 indicates that the hardware maintains the consistency between ITLB and TLB

1’b1

7

ICACHET

1 indicates that the hardware maintains the data consistency between ICache and DCache in one processor core

1’b1

10:8

SPW_LVL

The maximum number of directory levels supported by the page walk instruction

3’h4

11

SPW_HP_HF

1 indicates that the page walk instruction fills the TLB in half when it encounters a large page

1’b1

12

RVA

1 indicates that the software configuration can be used to shorten the virtual address range

1’b1

16:13

RVAMAX-1

The maximum configurable number of shortened virtual address bits, minus 1

1’b1

0x5

15:0

CC_MUL

Multiplication factor corresponding to the clock used by the constant frequency counter and timer

N/A

31:16

CC_DIV

Division factor corresponding to the clock used by the constant frequency counter and timer

N/A

0x6

0

PMP

1 indicates support for the performance counter

1’b1

3:1

PMVER

In the performance monitor, the architecture defines the version number of the event, and 1 is the initial version

3’h1

7:4

PMNUM

Number of performance monitors minus 1

4’h3

13:8

PMBITS

Number of bits of a performance monitor minus 1

6’h3f

14

UPM

1 indicates support for reading performance counter in user mode

1’b1

0x10

0

L1 IU_Present

1 indicates that there is a first-level instruction Cache or a first-level unified Cache

1’b1

1

L1 IU Unify

1 indicates that the Cache shown by L1 IU_Present is the unified Cache

1’b0

2

L1 D Present

1 indicates there is a first-level data Cache

1’b1

3

L2 IU Present

1 indicates there is a second-level instruction Cache or a second-level unified Cache

1’b1

4

L2 IU Unify

1 indicates that the Cache shown by L2 IU_Present is the unified Cache

1’b1

5

L2 IU Private

1 indicates that the Cache shown by L2 IU_Present is private to each core

1’b1

6

L2 IU Inclusive

1 indicates that the Cache shown by L2 IU_Present has an inclusive relationship to the lower levels (L1)

1’b0

7

L2 D Present

1 indicates there is a second-level data Cache

1’b0

8

L2 D Private

1 indicates that the second-level data Cache is private to each core

1’b0

9

L2 D Inclusive

1 indicates that the second-level data Cache has an inclusive relationship to the lower level (L1)

1’b0

10

L3 IU Present

1 indicates there is a third-level instruction Cache or a third-level unified Cache

1’b1

11

L3 IU Unify

1 indicates that the Cache shown by L3 IU_Present is unified Cache

1’b1

12

L3 IU Private

1 indicates that the Cache shown by L3 IU_Present is private to each core

1’b0

13

L3 IU Inclusive

1 indicates that the Cache shown by L3 IU_Present has an inclusive relationship to the lower levels (L1 and L2)

1’b1

14

L3 D Present

1 indicates there is a third-level data Cache

1’b0

15

L3 D Private

1 indicates that the third-level data Cache is private to each core

1’b0

16

L3 D Inclusive

1 indicates that the third-level data Cache has an inclusive relationship to the lower levels (L1 and L2)

1’b0

0x11

15:0

Way-1

Number of ways minus 1 (Cache corresponding to L1 IU_Present in configuration word 0x10)

16’h3

23:16

Index-log2

log2(number of Cache lines per way) (Cache corresponding to L1 IU_Present in configuration word 0x10)

8’h8

30:24

Linesize-log2

log2(Cache line bytes) (Cache corresponding to L1 IU_Present in configuration word 0x10)

8’h6

0x12

15:0

Way-1

Number of ways minus 1 (Cache corresponding to L1 D Present in configuration word 0x10)

16’h3

23:16

Index-log2

log2(number of Cache lines per way) (Cache corresponding to L1 D Present in configuration word 0x10)

8’h8

30:24

Linesize-log2

log2(Cache line bytes) (Cache corresponding to L1 D Present in configuration word 0x10)

8’h6

0x13

15:0

Way-1

Number of ways minus 1 (Cache corresponding to L2 IU Present in configuration word 0x10)

16’hf

23:16

Index-log2

log2(number of Cache lines per way) (Cache corresponding to L2 IU Present in configuration word 0x10)

8’h8

30:24

Linesize-log2

log2(Cache line bytes) (Cache corresponding to L2 IU Present in configuration word 0x10)

8’h6

0x14

15:0

Way-1

Number of ways minus 1 (Cache corresponding to L3 IU Present in configuration word 0x10)

16’hf

23:16

Index-log2

log2(number of Cache lines per way) (Cache corresponding to L3 IU Present in configuration word 0x10)

8’h8

30:24

Linesize-log2

log2(Cache line bytes) (Cache corresponding to L3 IU Present in configuration word 0x10)

8’h6
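
The three fields of configuration words 0x11 to 0x14 are enough to derive the capacity of each Cache as ways × lines per way × bytes per line. The following sketch shows the calculation on a word that has already been read. The helper name cache_size_bytes is hypothetical, and the sample words in the usage note are assembled from the "Possible Value" column, so real hardware may return different values.

```c
#include <assert.h>
#include <stdint.h>

/* Cache capacity in bytes from one of the configuration words 0x11-0x14:
 * [15:0]  Way-1          number of ways minus 1
 * [23:16] Index-log2     log2(Cache lines per way)
 * [30:24] Linesize-log2  log2(bytes per Cache line) */
uint64_t cache_size_bytes(uint32_t word)
{
    uint64_t ways = (uint64_t)(word & 0xffff) + 1;
    unsigned index_log2 = (word >> 16) & 0xff;
    unsigned line_log2 = (word >> 24) & 0x7f;

    assert(index_log2 + line_log2 < 48);   /* keep the shift well defined */
    return ways << (index_log2 + line_log2);
}
```

With the values 16’h3, 8’h8 and 8’h6 the word is 0x06080003, which gives 4 × 256 × 64 = 65536 bytes, matching the 64KB first-level Caches. With 16’hf instead, the result is 262144 bytes (256KB).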

8.2. Access to 3A5000 Control and Status Registers

The 3A5000 supports access to a configuration status register (CSR) space. CSRs are accessed through a new, independent address space, called the CSR space, that does not overlap with the existing register space, memory space, or JTAG space.

CSR read and write accesses are performed via the custom IOCSRRD.B/H/W/D and IOCSRWR.B/H/W/D instructions. IOCSRRD.B/H/W/D is used as IOCSRRD.B/H/W/D rd, rj, where the source operand register rj holds the address of the CSR to be accessed, and the value read from the CSR is written to register rd. IOCSRWR.B/H/W/D is used as IOCSRWR.B/H/W/D rd, rj, where the source operand register rj holds the address of the CSR to be accessed, and the source operand register rd holds the value to be written to the CSR. IOCSRRD.B/H/W/D and IOCSRWR.B/H/W/D may be executed in the kernel state only.

The IOCSRRD.B/H/W/D and IOCSRWR.B/H/W/D instructions can be used instead of the original address-mapped access to the configuration registers, i.e., the 0x1fe00000 and 0x3ff00000 spaces. See the relevant sections for access details.

9. Shared Cache (SCache)

The SCache module is the third-level Cache shared by all processor cores in the Loongson 3A5000 processor. The main features of the SCache module include:

  • 16-entry Cache access queue.

  • Critical word first.

  • Directory-based Cache coherence protocol.

  • Can be used in on-chip multi-core architectures or interfaced directly with single-processor IP.

  • 16-way set-associative architecture.

  • Supports ECC check.

  • Supports DMA coherent read/write and prefetch reads.

  • Supports 16 types of shared Cache hashing.

  • Supports locking of the shared Cache by window.

  • Guarantees atomicity of returned read data.

The shared Cache module consists of the shared Cache management module, scachemanage, and the shared Cache access module, scacheaccess. scachemanage processes access requests from processors and DMAs, while the TAG, directory and data of the shared Cache are stored in the scacheaccess module. To reduce power consumption, the TAG, directory and data of the shared Cache can be accessed separately. The shared Cache status bits and w bits are stored together with the TAG; the TAG is stored in the TAG RAM, the directory in the DIR RAM, and the data in the DATA RAM. A miss request accessing the shared Cache reads the TAG and directory of all ways at the same time, selects the directory according to the TAG, and reads the data according to the hit. Replacement requests, refill requests and write-back requests operate only on the TAG, directory and data of one way.

To improve the performance of some specific computing tasks, a locking mechanism is added to the shared Cache. Blocks that fall in a locked area of the shared Cache are locked and therefore will not be replaced out of the shared Cache (unless all 16 ways of the shared Cache contain locked blocks).

The four sets of lock window registers inside the shared Cache module can be configured dynamically through the chip configuration register space, but at least one of the 16 ways of the shared Cache must remain unlocked. In addition, when the shared Cache receives a DMA write request and the area to be written hits in the shared Cache and is locked, the DMA write is written directly to the shared Cache.

Table 55. Shared Cache lock window register configuration
Name Address Bit Field Description

Slock0_valid

0x00200

[63:63]

Valid bits for lock window 0

Slock0_addr

0x00200

[47:0]

Lock address for lock window 0

Slock0_mask

0x00240

[47:0]

Mask for lock window 0

Slock1_valid

0x00208

[63:63]

Valid bits for lock window 1

Slock1_addr

0x00208

[47:0]

Lock address for lock window 1

Slock1_mask

0x00248

[47:0]

Mask for lock window 1

Slock2_valid

0x00210

[63:63]

Valid bits for lock window 2

Slock2_addr

0x00210

[47:0]

Lock address for lock window 2

Slock2_mask

0x00250

[47:0]

Mask for lock window 2

Slock3_valid

0x00218

[63:63]

Valid bits for lock window 3

Slock3_addr

0x00218

[47:0]

Lock address for lock window 3

Slock3_mask

0x00258

[47:0]

Mask for lock window 3

For example, when an address addr causes slock0_valid && ((addr & slock0_mask) == (slock0_addr & slock0_mask)) to be 1, the address is locked by lock window 0.
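
The matching rule for a lock window can be written as a small predicate over the two register values of that window. This is a sketch with hypothetical names. It takes the raw 64-bit values of the address register (valid bit in [63], lock address in [47:0]) and the mask register (mask in [47:0]) as arguments and does not access hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOCK_VALID (1ULL << 63)             /* SlockN_valid, bit 63 */
#define SLOCK_FIELD 0x0000ffffffffffffULL    /* SlockN_addr / SlockN_mask, [47:0] */

/* Returns true if physical address `addr` falls in the lock window described
 * by the given address register and mask register values. */
bool scache_addr_locked(uint64_t addr, uint64_t addr_reg, uint64_t mask_reg)
{
    uint64_t base = addr_reg & SLOCK_FIELD;
    uint64_t mask = mask_reg & SLOCK_FIELD;

    assert((addr >> 48) == 0);               /* 48-bit physical address */
    if (!(addr_reg & SLOCK_VALID))
        return false;                        /* window disabled */
    return (addr & mask) == (base & mask);
}
```

For instance, a window with lock address 0x80000000 and mask 0xfffffff00000 covers the 1MB region starting at 0x80000000.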

The 4 SCache use the same configuration register, at base address 0x1fe00000 and offset address 0x0280. It can also be accessed using the configuration register instruction (IOCSR).

Table 56. Shared Cache configuration register (SC_CONFIG)
Bit Field Name Read/Write Reset Value Description

0

LRU en

RW

1’b1

SCache LRU replacement algorithm enable

16

Prefetch En

RW

1’b1

SCache prefetch function enable

22:20

Prefetch config

RW

3’h1

Stop prefetching when SCache prefetching exceeds the address range of the configured size

0 - 4KB

1 - 16KB

2 - 64KB

3 - 1MB

7 - Unlimited

(Note: Valid when SCID_SEL==0)

26:24

Prefetch lookahead

RW

3’h2

SCache prefetch size

0 - Reserved

1 - 0x100

2 - 0x200

3 - 0x300

4 - 0x400

5 - 0x500

6 - 0x600

7 - 0x700

(Note: Valid when SCID_SEL==0)

30:28

Sc stall dirq cycle

RW

3’h2

Number of clock cycles of SC instruction blocking dirq

0 - 1 cycle (nonstall)

1 - 16-31 cycle random

2 - 32-63 cycle random

3 - 64-127 cycle random

4 - 128-255 cycle random

Other - Invalid

31

MCC storefill en

RW

1’b0

MCC storefill function enable

34:32

35

MCC clean exclusive replace en

RW

1’b0

36

MCC clean shared replace en

RW

1’b0

10. Inter-Processor Interrupts and Communication

The Loongson 3A5000 implements eight inter-processor interrupt registers (IPI) for each processor core to support interrupts and communication between processor cores during multi-core BIOS boot and OS runtime.

Two different access modes are supported in the Loongson 3A5000, one is an address access mode compatible with processors such as the 3A3000, and the other is designed to support direct private access to the processor register space. Separate descriptions are provided in later sections.

10.1. Accessing by Address

For the Loongson 3A5000, the following registers can be accessed using base address 0x1fe0_0000, and they can also be accessed using the configuration register instruction (IOCSR). Base address 0x3ff0_0000 can be turned off by the disable_0x3ff0 control bit in the Routing Setup Register. See the tables below for specific register descriptions and addresses.

The registers related to interprocessor interrupts and their functions in the Loongson 3A5000 are described as follows:

Table 58. List of inter-processor interrupt and communication registers for processor core 0
Name Offset Address Read/Write Description

Core0_IPI_Status

0x1000

R

IPI_Status register of processor core 0

Core0_IPI_Enable

0x1004

RW

IPI_Enable register of processor core 0

Core0_IPI_Set

0x1008

W

IPI_Set register of processor core 0

Core0_IPI_Clear

0x100c

W

IPI_Clear register of processor core 0

Core0_MailBox0

0x1020

RW

IPI_MailBox0 register of processor core 0

Core0_MailBox1

0x1028

RW

IPI_MailBox1 register of processor core 0

Core0_MailBox2

0x1030

RW

IPI_MailBox2 register of processor core 0

Core0_MailBox3

0x1038

RW

IPI_MailBox3 register of processor core 0

Table 59. List of inter-processor interrupt and communication registers for processor core 1
Name Offset Address Read/Write Description

Core1_IPI_Status

0x1100

R

IPI_Status register of processor core 1

Core1_IPI_Enable

0x1104

RW

IPI_Enable register of processor core 1

Core1_IPI_Set

0x1108

W

IPI_Set register of processor core 1

Core1_IPI_Clear

0x110c

W

IPI_Clear register of processor core 1

Core1_MailBox0

0x1120

R

IPI_MailBox0 register of processor core 1

Core1_MailBox1

0x1128

RW

IPI_MailBox1 register of processor core 1

Core1_MailBox2

0x1130

W

IPI_MailBox2 register of processor core 1

Core1_MailBox3

0x1138

W

IPI_MailBox3 register of processor core 1

Table 60. List of inter-processor interrupt and communication registers for processor core 2
Name Offset Address Read/Write Description

Core2_IPI_Status

0x1200

R

IPI_Status register of processor core 2

Core2_IPI_Enable

0x1204

RW

IPI_Enable register of processor core 2

Core2_IPI_Set

0x1208

W

IPI_Set register of processor core 2

Core2_IPI_Clear

0x120c

W

IPI_Clear register of processor core 2

Core2_MailBox0

0x1220

R

IPI_MailBox0 register of processor core 2

Core2_MailBox1

0x1228

RW

IPI_MailBox1 register of processor core 2

Core2_MailBox2

0x1230

W

IPI_MailBox2 register of processor core 2

Core2_MailBox3

0x1238

W

IPI_MailBox3 register of processor core 2

Table 61. List of inter-processor interrupt and communication registers for processor core 3
Name Offset Address Read/Write Description

Core3_IPI_Status

0x1300

R

IPI_Status register of processor core 3

Core3_IPI_Enable

0x1304

RW

IPI_Enable register of processor core 3

Core3_IPI_Set

0x1308

W

IPI_Set register of processor core 3

Core3_IPI_Clear

0x130c

W

IPI_Clear register of processor core 3

Core3_MailBox0

0x1320

R

IPI_MailBox0 register of processor core 3

Core3_MailBox1

0x1328

RW

IPI_MailBox1 register of processor core 3

Core3_MailBox2

0x1330

W

IPI_MailBox2 register of processor core 3

Core3_MailBox3

0x1338

W

IPI_MailBox3 register of processor core 3

The above is a list of inter-processor interrupt-related registers for a single-node multiprocessor system composed of a single Loongson 3A5000 chip. When multiple Loongson 3A5000 chips are interconnected to form a multi-node CC-NUMA system, each node within a chip corresponds to a global system node number, and the IPI registers of the processor cores within a node sit at the fixed offsets listed in the tables above, relative to the base address of that node. For example, the IPI_Status address of processor core 0 of node 0 is 0x1fe01000, while the IPI_Status address of processor core 0 of node 1 is 0x10001fe01000, and so on.
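
The address pattern in the tables above (0x1000 for core 0, plus 0x100 per core) and the node example can be combined into one address calculation. This is a sketch with hypothetical names. The node shift of 44 bits is an assumption inferred from the single example address 0x10001fe01000 and should be checked against the address space chapter before use.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_BASE    0x1fe00000ULL   /* configuration register base address */
#define IPI_CORE0   0x1000u         /* offset of Core0_IPI_Status */
#define IPI_STRIDE  0x100u          /* distance between per-core register groups */
#define NODE_SHIFT  44              /* assumed: node number starts at address bit 44 */

/* Physical address of an IPI register; reg_off is the offset inside the
 * per-core group, e.g. 0x00 for IPI_Status or 0x20 for MailBox0. */
uint64_t ipi_reg_addr(unsigned node, unsigned core, unsigned reg_off)
{
    assert(node < 16 && core < 4 && reg_off < IPI_STRIDE);
    return ((uint64_t)node << NODE_SHIFT) |
           (CFG_BASE + IPI_CORE0 + core * IPI_STRIDE + reg_off);
}
```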

10.2. Accessing by Configuration Register Instructions

In the Loongson 3A5000, a new processor core direct register access instruction has been added to provide access to configuration registers through private space. In order to use inter-processor interrupt registers more easily, some adjustments are made to the inter-processor interrupt register definitions in this mode.

Table 62. List of inter-processor interrupt and communication registers for the current processor core
Name Offset Address Read/Write Description

perCore_IPI_Status

0x1000

R

IPI_Status register of the current processor core

perCore_IPI_Enable

0x1004

RW

IPI_Enable register of the current processor core

perCore_IPI_Set

0x1008

W

IPI_Set register of the current processor core

perCore_IPI_Clear

0x100c

W

IPI_Clear register of the current processor core

perCore_MailBox0

0x1020

RW

IPI_MailBox0 register of the current processor core

perCore_MailBox1

0x1028

RW

IPI_MailBox1 register of the current processor core

perCore_MailBox2

0x1030

RW

IPI_MailBox2 register of the current processor core

perCore_MailBox3

0x1038

RW

IPI_MailBox3 register of the current processor core

In order to send inter-processor interrupt requests and MailBox communication to other cores, the following registers are accessed.

Table 63. Processor core inter-processor communication registers
Name Offset Address Read/Write Description

IPI_Send

0x1040

WO

32-bit interrupt distribution register

[31]: wait for completion flag; when set to 1 it will wait for the interrupt to take effect

[30:26]: reserved

[25:16]: processor core number

[15:5]: reserved

[4:0]: interrupt vector number, corresponding to the vector in IPI_Status

Mail_Send

0x1048

WO

64-bit MailBox buffer register

[63:32]: MailBox data

[31]: wait for completion flag; when set to 1 it will wait for the write to take effect

[30:27]: write data mask; each bit set to 1 means that the corresponding byte of the 32-bit write data is not written to the target address. For example, 1000b writes bytes 0-2 only, and 0000b writes all of bytes 0-3

[26]: reserved

[25:16]: processor core number

[15:5]: reserved

[4:2]: MailBox number

0 - MailBox0 low 32-bit

1 - MailBox0 high 32-bit

2 - MailBox1 low 32-bit

3 - MailBox1 high 32-bit

4 - MailBox2 low 32-bit

5 - MailBox2 high 32-bit

6 - MailBox3 low 32-bit

7 - MailBox3 high 32-bit

[1:0]: reserved

FREQ_Send

0x1058

WO

32-bit frequency enable register

[31]: wait for completion flag; when set to 1 it will wait for the setting to take effect

[30:27]: write data mask; each bit set to 1 means that the corresponding byte of the 32-bit write data is not written to the target address. For example, 1000b writes bytes 0-2 only, and 0000b writes all of bytes 0-3

[26]: reserved

[25:16]: processor core number

[15:5]: reserved

[4:0]: write to the corresponding processor core private frequency configuration register

CSR[0x1050]
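
As an illustration of the IPI_Send layout above, the following sketch assembles the 32-bit value to be written to offset 0x1040. The function name is hypothetical, and the write itself (an IOCSR write) is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Value for IPI_Send (offset 0x1040):
 * [31]    wait for completion flag
 * [25:16] target processor core number
 * [4:0]   interrupt vector number (bit index in IPI_Status) */
uint32_t ipi_send_value(unsigned cpu, unsigned vector, int wait)
{
    assert(cpu < 1024 && vector < 32);
    return ((wait ? 1u : 0u) << 31) | ((uint32_t)cpu << 16) | (uint32_t)vector;
}
```

Raising vector 1 on core 2 and waiting for it to take effect therefore uses the value 0x80020001.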

Note that since the Mail_Send register can only send 32 bits of data at a time, a 64-bit value must be split into two transmissions. The target core therefore needs to ensure the integrity of the transfer by other software means while waiting for the contents of the MailBox. For example, after sending the MailBox data, the sender can use an inter-processor interrupt to indicate that the transmission is complete.
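
Splitting a 64-bit message into two Mail_Send writes can be sketched as follows. All names are hypothetical. The sketch leaves the write data mask [30:27] at 0 so that all four bytes are written, sets the wait flag on both halves, and leaves the completion notification (for example an IPI) to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Value for Mail_Send (offset 0x1048) carrying one 32-bit half.
 * slot follows the [4:2] encoding: 2*N = MailBoxN low, 2*N+1 = MailBoxN high. */
uint64_t mail_send_value(unsigned cpu, unsigned slot, uint32_t data, int wait)
{
    assert(cpu < 1024 && slot < 8);
    return ((uint64_t)data << 32) |
           ((uint64_t)(wait ? 1 : 0) << 31) |
           ((uint64_t)cpu << 16) |
           ((uint64_t)slot << 2);
}

struct mail_pair {
    uint64_t lo;   /* first write: low 32 bits of the message */
    uint64_t hi;   /* second write: high 32 bits of the message */
};

/* The two Mail_Send values needed to place `msg` in MailBox `box` of `cpu`. */
struct mail_pair mail_send_split64(unsigned cpu, unsigned box, uint64_t msg)
{
    struct mail_pair p;

    assert(box < 4);
    p.lo = mail_send_value(cpu, 2 * box, (uint32_t)msg, 1);
    p.hi = mail_send_value(cpu, 2 * box + 1, (uint32_t)(msg >> 32), 1);
    return p;
}
```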

10.3. Debug Support for Configuration Register Instructions

The configuration register instructions are in principle not used for cross-chip access, but to meet the needs of debugging and similar uses, cross-chip access is supported through several register addresses. Note that such registers can only be written, not read.

In addition to the IPI_Send, Mail_Send and FREQ_Send registers mentioned in the previous section, an ANY_Send register is also available at the following address.

Table 64. Processor core inter-processor communication registers
Name Offset Address Read/Write Description

ANY_Send

0x1158

WO

64-bit register access register

[63:32]: data being written

[31]: wait for completion flag; when set to 1 it will wait for the write to take effect

[30:27]: write data mask; each bit that is set to 1 prevents the corresponding byte of the 32-bit write data from being written to the target address; for example, 1000b writes only bytes 0-2 and 0000b writes all of bytes 0-3

[26]: reserved

[25:16]: destination processor core number

[15:0]: offset address of the register to be written
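As an illustration of the ANY_Send layout, the following Python sketch packs the fields from the table into the 64-bit value that would be written to offset 0x1158. The function name any_send_value is hypothetical and the register offset used in the example is arbitrary; the sketch performs no hardware access.

```python
def any_send_value(data32, core, reg_offset, wait=True, byte_mask=0):
    """Pack the ANY_Send fields into one 64-bit value (hypothetical helper)."""
    if not 0 <= data32 <= 0xFFFFFFFF:
        raise ValueError("data32 must fit in 32 bits")
    if not 0 <= core <= 0x3FF:
        raise ValueError("core number out of range")
    if not 0 <= reg_offset <= 0xFFFF:
        raise ValueError("register offset must fit in 16 bits")
    if not 0 <= byte_mask <= 0xF:
        raise ValueError("byte_mask must fit in 4 bits")
    return (
        (data32 << 32)          # [63:32] data being written
        | (int(wait) << 31)     # [31] wait for completion
        | (byte_mask << 27)     # [30:27] bytes whose mask bit is 1 are skipped
        | (core << 16)          # [25:16] destination processor core number
        | reg_offset            # [15:0] offset of the register to be written
    )
```

For example, any_send_value(0x1, core=3, reg_offset=0x1428) requests that the value 0x1 be written to offset 0x1428 as seen by core 3, and waits for completion.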

11. I/O Interrupts

The Loongson 3A5000 chip supports two interrupt methods. The first is the legacy interrupt method, which is compatible with processors such as the 3A3000. The second is the new extended I/O interrupt method, which supports cross-chip delivery and dynamic distribution of HT controller interrupts. Each of the two methods is described below.

11.1. Legacy I/O Interrupts

The legacy interrupts on the Loongson 3A5000 chip support 32 interrupt sources managed in a unified manner, as shown in the figure below. For each I/O interrupt source, software can configure whether it is enabled, how it is triggered, and which processor core interrupt pin it is routed to. Legacy interrupts do not support cross-chip distribution and can only interrupt processor cores within the same processor chip.

Figure 6. Interrupt routing of Loongson 3A5000 processor

Each interrupt-related configuration register controls the interrupt lines bit by bit: a given bit position in each register applies to the same interrupt source. The interrupt control bit assignments and attributes are listed in the following table.

Interrupt enable is configured through three registers: Intenset, Intenclr and Inten. Writing 1 to a bit in Intenset enables the corresponding interrupt. Writing 1 to a bit in Intenclr disables the corresponding interrupt. Inten is read to obtain the current enable status of each interrupt. The trigger mode is selected by the Intedge configuration register: writing 1 selects edge-triggered and writing 0 selects level-triggered. The interrupt handler can clear a recorded interrupt by writing the corresponding bit of Intenclr; note that clearing the interrupt also clears its interrupt enable.

Table 65. Interrupt control register
Bit Field Read/Write (Default Value)

Intedge

Inten

Intenset

Intenclr

Interrupt Source

0

RW (0)

R (0)

RW (0)

RW (0)

GPIO24/16/8/0/SC0

1

RW (0)

R (0)

RW (0)

RW (0)

GPIO25/17/9/1/SC1

2

RW (0)

R (0)

RW (0)

RW (0)

GPIO26/18/10/2/SC2

3

RW (0)

R (0)

RW (0)

RW (0)

GPIO27/19/11/3/SC3

4

RW (0)

R (0)

RW (0)

RW (0)

GPIO28/20/12/4

5

RW (0)

R (0)

RW (0)

RW (0)

GPIO29/21/13/5

6

RW (0)

R (0)

RW (0)

RW (0)

GPIO30/22/14/6

7

RW (0)

R (0)

RW (0)

RW (0)

GPIO31/23/15/7

8

RW (0)

R (0)

RW (0)

RW (0)

I2C0

9

RW (0)

R (0)

RW (0)

RW (0)

I2C1

10

RW (0)

R (0)

RW (0)

RW (0)

UART0

11

RW (0)

R (0)

RW (0)

RW (0)

MC0

12

RW (0)

R (0)

RW (0)

RW (0)

MC1

13

RW (0)

R (0)

RW (0)

RW (0)

SPI

14

RW (0)

R (0)

RW (0)

RW (0)

Thsens

15

RW (0)

R (0)

RW (0)

RW (0)

UART1

23:16

RW (0)

R (0)

RW (0)

RW (0)

HT0[7:0]

31:24

RW (0)

R (0)

RW (0)

RW (0)

HT1[7:0]

As with inter-processor interrupts, the I/O interrupt registers can be accessed either by address, using the base address 0x1fe00000, or through the processor core’s dedicated configuration register instructions.

11.1.1. Accessing by Address

This access is compatible with that of processors such as the 3A3000, where either 0x1fe00000 or 0x3ff00000 can be used for the base address. The 0x3ff00000 base address can be disabled via the disable_0x3ff0 control bit in the Routing Configuration Register.

Table 66. I/O control register address
Name Offset Address Description

Intisr

0x1420

32-bit interrupt status register

Inten

0x1424

32-bit interrupt enable status register

Intenset

0x1428

32-bit set enable register

Intenclr

0x142c

32-bit clear enable register

Intedge

0x1434

32-bit trigger mode register

CORE0_INTISR

0x1440

32-bit interrupt status routed to CORE0

CORE1_INTISR

0x1448

32-bit interrupt status routed to CORE1

CORE2_INTISR

0x1450

32-bit interrupt status routed to CORE2

CORE3_INTISR

0x1458

32-bit interrupt status routed to CORE3
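The set/clear behaviour of the enable registers can be illustrated with a small software model. The sketch below is plain Python and does not touch hardware; the class name LegacyIntcModel is hypothetical, and only the Intenset, Intenclr, Inten and Intedge semantics described earlier in this section are modelled, using the offsets from the table above.

```python
INTEN = 0x1424      # read: current enable status
INTENSET = 0x1428   # write 1 to a bit: enable that interrupt
INTENCLR = 0x142C   # write 1 to a bit: disable that interrupt
INTEDGE = 0x1434    # 1 = edge-triggered, 0 = level-triggered


class LegacyIntcModel:
    """Software-only model of the legacy enable registers (an assumption-based sketch)."""

    def __init__(self):
        self.inten = 0
        self.intedge = 0

    def write(self, offset, value):
        value &= 0xFFFFFFFF
        if offset == INTENSET:
            self.inten |= value                  # bits written as 1 become enabled
        elif offset == INTENCLR:
            self.inten &= ~value & 0xFFFFFFFF    # bits written as 1 become disabled
        elif offset == INTEDGE:
            self.intedge = value
        else:
            raise ValueError("offset not modelled: %#x" % offset)

    def read(self, offset):
        if offset == INTEN:
            return self.inten
        if offset == INTEDGE:
            return self.intedge
        raise ValueError("offset not modelled: %#x" % offset)
```

Because Intenset and Intenclr act only on the bits written as 1, software can enable or disable one source (for example bit 10, UART0) without a read-modify-write of Inten.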

Four processor cores are integrated in the Loongson 3A5000, and software can select the target processor core for each of the 32 interrupt sources described above. In addition, each interrupt source can be routed to any of the interrupt pins INT0 to INT3 of the processor core. Each of the 32 I/O interrupt sources has an 8-bit routing register, whose format and addresses are shown in Description of the interrupt routing register and Interrupt routing register address. The routing register uses a bit-vector (bitmap) encoding; for example, 0x48 routes the interrupt to INT2 of processor core 3.

Starting with the 3A5000, the interrupt pin field can alternatively be numerically encoded; this mode is enabled by the control bit CSR[0x420][49]. When this bit is set, bits [7:4] in the table below change from a bitmap to a numeric encoding, in which the values 0-7 select interrupt pins 0-7. For example, in this mode 0x28 routes the interrupt to INT2 of processor core 3.

Table 67. Description of the interrupt routing register
Bit Field Description

3:0

Processor core vector number for routing

7:4

Processor core interrupt pin vector number for routing

Table 68. Interrupt routing register address
Name Offset Address Description Name Offset Address Description

Entry0

0x1400

GPIO24/16/8/0

Entry16

0x1410

HT0-int0

Entry1

0x1401

GPIO25/17/9/1

Entry17

0x1411

HT0-int1

Entry2

0x1402

GPIO26/18/10/2

Entry18

0x1412

HT0-int2

Entry3

0x1403

GPIO27/19/11/3

Entry19

0x1413

HT0-int3

Entry4

0x1404

GPIO28/20/12/4

Entry20

0x1414

HT0-int4

Entry5

0x1405

GPIO29/21/13/5

Entry21

0x1415

HT0-int5

Entry6

0x1406

GPIO30/22/14/6

Entry22

0x1416

HT0-int6

Entry7

0x1407

GPIO31/23/15/7

Entry23

0x1417

HT0-int7

Entry8

0x1408

I2C0

Entry24

0x1418

HT1-int0

Entry9

0x1409

I2C1

Entry25

0x1419

HT1-int1

Entry10

0x140a

UART0

Entry26

0x141a

HT1-int2

Entry11

0x140b

MC0

Entry27

0x141b

HT1-int3

Entry12

0x140c

MC1

Entry28

0x141c

HT1-int4

Entry13

0x140d

SPI

Entry29

0x141d

HT1-int5

Entry14

0x140e

Thsens

Entry30

0x141e

HT1-int6

Entry15

0x140f

UART1

Entry31

0x141f

HT1-int7
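The two routing encodings described before the tables (0x48 in bitmap mode, 0x28 in numeric pin mode) can be reproduced with a short Python sketch. The helper names route_entry_offset and route_byte are hypothetical; the sketch only computes the register offset and the 8-bit value, and assumes a single target core per entry.

```python
ENTRY_BASE = 0x1400  # Entry0 in the interrupt routing register table


def route_entry_offset(source):
    """Offset of the 8-bit routing register for legacy interrupt source 0..31."""
    if not 0 <= source <= 31:
        raise ValueError("source must be 0..31")
    return ENTRY_BASE + source


def route_byte(core, pin, encoded_pin=False):
    """Build the routing byte: [3:0] core bitmap, [7:4] pin bitmap or pin number."""
    if not 0 <= core <= 3:
        raise ValueError("core must be 0..3")
    if encoded_pin:
        # Numeric mode, enabled by CSR[0x420][49]: the field holds the pin number.
        if not 0 <= pin <= 7:
            raise ValueError("pin must be 0..7 in numeric mode")
        pin_field = pin
    else:
        # Bitmap mode: one bit per pin, so only INT0..INT3 fit in four bits.
        if not 0 <= pin <= 3:
            raise ValueError("pin must be 0..3 in bitmap mode")
        pin_field = 1 << pin
    return (pin_field << 4) | (1 << core)
```

Routing UART0 (Entry10) to INT2 of core 3 would therefore mean writing route_byte(3, 2) to offset route_entry_offset(10).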

11.1.2. Accessing by Configuration Register Instructions

In the Loongson 3A5000, these configuration registers can also be accessed through the private address space using the configuration register instructions. The offset address used by the instruction is the same as the offset used when accessing by address. In addition, for convenience, each core has a dedicated private interrupt status register that reports the interrupts currently routed to that core, as shown in the following table.

Table 69. Processor core private interrupt status register
Name Offset Address Description

perCore_INTISR

0x1010

32-bit interrupt status routed to the current processor core

11.2. Extended I/O Interrupts

In addition to the legacy I/O interrupt method, the 3A5000 supports extended I/O interrupts, which distribute the 256 interrupts on the HT bus directly to the processor cores instead of forwarding them through the HT interrupt lines. This increases the flexibility of I/O interrupt usage.

Before a core can use extended I/O interrupts, the corresponding bit in the “Other function configuration register” must be enabled. This register is located at offset address 0x0420 from the base address 0x1fe00000; it can also be accessed using the configuration register instruction (IOCSR).

Table 70. Other function configuration register
Bit Field Name Read/Write Reset Value Description

48

EXT_INT_en

RW

0x0

Extended I/O interrupt enable

In extended I/O interrupt mode, HT interrupts can be forwarded directly across chips and distributed in rotation. In the current version, up to 256 extended interrupt vectors are supported.
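Enabling the extended mode amounts to a read-modify-write of the 64-bit register at offset 0x0420. The Python sketch below only computes the new register value; the constant and function names are hypothetical, bit 48 is EXT_INT_en from the table above, and bit 49 is the numeric pin-encoding control that this chapter refers to as CSR[0x420][49].

```python
MISC_FUNC_OFFSET = 0x0420      # "Other function configuration register"
EXT_INT_EN = 1 << 48           # extended I/O interrupt enable
INT_PIN_ENCODE = 1 << 49       # numeric interrupt pin encoding (CSR[0x420][49])


def enable_extended_io(current, encoded_pins=False):
    """Return the register value with extended I/O interrupts enabled.

    `current` is the value previously read from offset 0x0420; all other
    bits are preserved.
    """
    new = current | EXT_INT_EN
    if encoded_pins:
        new |= INT_PIN_ENCODE
    return new
```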

11.2.1. Accessing by Address

The extended I/O interrupt registers are listed below. As with the other configuration registers, they can be accessed using the base address 0x1fe00000 or through the processor core’s dedicated configuration register instructions.

Table 71. Extended I/O interrupt enable register
Name Offset Address Description

EXT_IOIen[63:0]

0x1600

Interrupt enable configuration for extended I/O interrupt [63:0]

EXT_IOIen[127:64]

0x1608

Interrupt enable configuration for extended I/O interrupt [127:64]

EXT_IOIen[191:128]

0x1610

Interrupt enable configuration for extended I/O interrupt [191:128]

EXT_IOIen[255:192]

0x1618

Interrupt enable configuration for extended I/O interrupt [255:192]

Table 72. Extended I/O interrupt auto-rotation enable register
Name Offset Address Description

EXT_IOIbounce[63:0]

0x1680

Auto-rotation enable register for extended I/O interrupt [63:0]

EXT_IOIbounce[127:64]

0x1688

Auto-rotation enable register for extended I/O interrupt [127:64]

EXT_IOIbounce[191:128]

0x1690

Auto-rotation enable register for extended I/O interrupt [191:128]

EXT_IOIbounce[255:192]

0x1698

Auto-rotation enable register for extended I/O interrupt [255:192]

Table 73. Extended I/O interrupt status register
Name Offset Address Description

EXT_IOIsr[63:0]

0x1700

Interrupt status for extended I/O interrupt [63:0]

EXT_IOIsr[127:64]

0x1708

Interrupt status for extended I/O interrupt [127:64]

EXT_IOIsr[191:128]

0x1710

Interrupt status for extended I/O interrupt [191:128]

EXT_IOIsr[255:192]

0x1718

Interrupt status for extended I/O interrupt [255:192]

Table 74. Extended I/O interrupt status register for each processor core
Name Offset Address Description

CORE0_EXT_IOIsr[63:0]

0x1800

Interrupt status of extended I/O interrupt [63:0] routed to processor core 0

CORE0_EXT_IOIsr[127:64]

0x1808

Interrupt status of extended I/O interrupt [127:64] routed to processor core 0

CORE0_EXT_IOIsr[191:128]

0x1810

Interrupt status of extended I/O interrupt [191:128] routed to processor core 0

CORE0_EXT_IOIsr[255:192]

0x1818

Interrupt status of extended I/O interrupt [255:192] routed to processor core 0

CORE1_EXT_IOIsr[63:0]

0x1900

Interrupt status of extended I/O interrupt [63:0] routed to processor core 1

CORE1_EXT_IOIsr[127:64]

0x1908

Interrupt status of extended I/O interrupt [127:64] routed to processor core 1

CORE1_EXT_IOIsr[191:128]

0x1910

Interrupt status of extended I/O interrupt [191:128] routed to processor core 1

CORE1_EXT_IOIsr[255:192]

0x1918

Interrupt status of extended I/O interrupt [255:192] routed to processor core 1

CORE2_EXT_IOIsr[63:0]

0x1A00

Interrupt status of extended I/O interrupt [63:0] routed to processor core 2

CORE2_EXT_IOIsr[127:64]

0x1A08

Interrupt status of extended I/O interrupt [127:64] routed to processor core 2

CORE2_EXT_IOIsr[191:128]

0x1A10

Interrupt status of extended I/O interrupt [191:128] routed to processor core 2

CORE2_EXT_IOIsr[255:192]

0x1A18

Interrupt status of extended I/O interrupt [255:192] routed to processor core 2

CORE3_EXT_IOIsr[63:0]

0x1B00

Interrupt status of extended I/O interrupt [63:0] routed to processor core 3

CORE3_EXT_IOIsr[127:64]

0x1B08

Interrupt status of extended I/O interrupt [127:64] routed to processor core 3

CORE3_EXT_IOIsr[191:128]

0x1B10

Interrupt status of extended I/O interrupt [191:128] routed to processor core 3

CORE3_EXT_IOIsr[255:192]

0x1B18

Interrupt status of extended I/O interrupt [255:192] routed to processor core 3
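Since the enable, auto-rotation and status registers all pack 64 vectors per 64-bit register, the register offset and bit position for a given vector follow directly from the tables above. The Python sketch below shows the arithmetic; the names locate and core_status_location are hypothetical, and the per-core stride of 0x100 is read off the offsets in the per-core status table.

```python
EXT_IOIEN_BASE = 0x1600        # EXT_IOIen[63:0]
EXT_IOIBOUNCE_BASE = 0x1680    # EXT_IOIbounce[63:0]
EXT_IOISR_BASE = 0x1700        # EXT_IOIsr[63:0]
CORE_EXT_IOISR_BASE = 0x1800   # CORE0_EXT_IOIsr[63:0]
CORE_STRIDE = 0x100            # distance between CORE0_, CORE1_, ... register blocks


def locate(vector, base):
    """Return (register offset, bit index) for extended interrupt vector 0..255."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be 0..255")
    return base + 8 * (vector // 64), vector % 64


def core_status_location(core, vector):
    """Return (offset, bit) of a vector in the status register of the given core."""
    if not 0 <= core <= 3:
        raise ValueError("core must be 0..3")
    return locate(vector, CORE_EXT_IOISR_BASE + CORE_STRIDE * core)
```

For instance, enabling vector 200 means setting bit 8 of the register at offset 0x1618.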

As with legacy I/O interrupts, software can select the target processor core for each of the 256 extended I/O interrupt sources.

However, the interrupt sources cannot be individually routed to one of the processor core interrupt pins INT0 through INT3; instead, interrupt pin routing is configured per group of 32 sources. The interrupt pin routing registers, configured by group, are described below.

Starting with the 3A5000, the interrupt pin field can alternatively be numerically encoded; this mode is enabled by the control bit CSR[0x420][49]. When this bit is set, bits [3:0] in the table below change from a bitmap to a numeric encoding, in which the values 0-7 select interrupt pins 0-7. For example, in this mode 0x2 indicates routing to INT2.

Table 75. Description of the interrupt pin routing register
Bit Field Description

3:0

Processor core interrupt pin vector number for routing

7:4

Reserved

Table 76. Interrupt routing register address
Name Offset Address Description

EXT_IOImap0

0x14C0

Pin routing method of EXT_IOI[31:0]

EXT_IOImap1

0x14C1

Pin routing method of EXT_IOI[63:32]

EXT_IOImap2

0x14C2

Pin routing method of EXT_IOI[95:64]

EXT_IOImap3

0x14C3

Pin routing method of EXT_IOI[127:96]

EXT_IOImap4

0x14C4

Pin routing method of EXT_IOI[159:128]

EXT_IOImap5

0x14C5

Pin routing method of EXT_IOI[191:160]

EXT_IOImap6

0x14C6

Pin routing method of EXT_IOI[223:192]

EXT_IOImap7

0x14C7

Pin routing method of EXT_IOI[255:224]
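The group-based pin routing can be sketched in Python as follows. The helper names are hypothetical; the sketch only computes which EXT_IOImap register covers a vector and what value selects a given pin in each of the two encodings.

```python
EXT_IOIMAP_BASE = 0x14C0  # EXT_IOImap0, covering EXT_IOI[31:0]


def pin_map_offset(vector):
    """Offset of the 8-bit pin routing register for the group containing `vector`."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be 0..255")
    return EXT_IOIMAP_BASE + vector // 32   # one register per 32 vectors


def pin_map_value(pin, encoded=False):
    """Value for bits [3:0]: a one-hot bitmap, or the pin number in numeric mode."""
    if encoded:
        if not 0 <= pin <= 7:
            raise ValueError("pin must be 0..7 in numeric mode")
        return pin
    if not 0 <= pin <= 3:
        raise ValueError("pin must be 0..3 in bitmap mode")
    return 1 << pin
```

Because the register is shared by 32 vectors, changing it moves the whole group to the new interrupt pin.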

Each interrupt source additionally has an 8-bit routing register, whose format and addresses are shown in Description of the interrupt destination processor core routing register and Interrupt destination processor core routing register address. Bits [7:4] select one of the node routing vectors defined in Interrupt destination node mapping method configuration. The routing register uses a bit-vector encoding for the processor core; for example, 0x48 routes the interrupt to processor core 3 of the node selected by EXT_IOI_node_type4.

Table 77. Description of the interrupt destination processor core routing register
Bit Field Description

3:0

Processor core vector number for routing

7:4

Selection of node mapping method of routing (as configured in Interrupt destination node mapping method configuration)

Note that in rotating distribution mode (the corresponding EXT_IOIbounce bit is 1), the interrupt rotates over every combination of the mapped node numbers and processor core numbers. The setting of EXT_IOIbounce should therefore be consistent with the associated route mapping configuration.

For example, when the routing register in the tables above is set to 0x27 and EXT_IOI_node_type2 in the tables below is set to 0x0013, the interrupt rotates in turn over node 0 core 0, node 0 core 1, node 0 core 2, node 1 core 0, node 1 core 1, node 1 core 2, node 4 core 0, node 4 core 1, and node 4 core 2.

In fixed distribution mode (the corresponding EXT_IOIbounce bit is 0), the node number bitmap may have at most one bit set to 1; an all-zero bitmap corresponds to local triggering.
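The 0x27 / 0x0013 example can be checked with a few lines of Python. The function name rotation_targets is hypothetical; it expands a routing byte and the sixteen EXT_IOI_node_type values into the list of (node, core) pairs. The order of the list follows the order given in the example above, which is assumed here to be the rotation order.

```python
def rotation_targets(core_map, node_types):
    """Expand an EXT_IOImap_Core value into (node, core) targets.

    core_map   -- 8-bit routing value: [3:0] core bitmap, [7:4] node type index
    node_types -- list of the sixteen 16-bit EXT_IOI_node_type values
    """
    core_bits = core_map & 0xF
    node_vector = node_types[(core_map >> 4) & 0xF]
    nodes = [n for n in range(16) if (node_vector >> n) & 1]
    cores = [c for c in range(4) if (core_bits >> c) & 1]
    # Every mapped node is combined with every mapped core.
    return [(n, c) for n in nodes for c in cores]
```

With EXT_IOI_node_type2 = 0x0013 (nodes 0, 1 and 4) and a routing byte of 0x27 (node type 2, cores 0 to 2), the function returns the nine targets listed in the example.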

Table 78. Interrupt destination processor core routing register address
Name Offset Address Description

EXT_IOImap_Core0

0x1C00

Processor core routing method of EXT_IOI[0]

EXT_IOImap_Core1

0x1C01

Processor core routing method of EXT_IOI[1]

EXT_IOImap_Core2

0x1C02

Processor core routing method of EXT_IOI[2]

…​…​

EXT_IOImap_Core254

0x1CFE

Processor core routing method of EXT_IOI[254]

EXT_IOImap_Core255

0x1CFF

Processor core routing method of EXT_IOI[255]

Table 79. Interrupt destination node mapping method configuration
Name Offset Address Description

EXT_IOI_node_type0

0x14A0

Mapping vector type 0 for 16 nodes (software configuration)

EXT_IOI_node_type1

0x14A2

Mapping vector type 1 for 16 nodes (software configuration)

EXT_IOI_node_type2

0x14A4

Mapping vector type 2 for 16 nodes (software configuration)

…​…​

EXT_IOI_node_type15

0x14BE

Mapping vector type 15 for 16 nodes (software configuration)
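The elided rows of the two tables above follow a regular pattern, which the Python sketch below makes explicit: one byte per vector starting at 0x1C00, and one 16-bit entry per node mapping type starting at 0x14A0. The function names are hypothetical. The last helper encodes the fixed distribution rule stated earlier (at most one node bit set).

```python
EXT_IOIMAP_CORE_BASE = 0x1C00   # EXT_IOImap_Core0
EXT_IOI_NODE_TYPE_BASE = 0x14A0  # EXT_IOI_node_type0


def core_map_offset(vector):
    """Offset of EXT_IOImap_Core<vector>, one byte per extended interrupt vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be 0..255")
    return EXT_IOIMAP_CORE_BASE + vector


def node_type_offset(index):
    """Offset of EXT_IOI_node_type<index>, one 16-bit entry per mapping type."""
    if not 0 <= index <= 15:
        raise ValueError("index must be 0..15")
    return EXT_IOI_NODE_TYPE_BASE + 2 * index


def valid_for_fixed_mode(node_vector):
    """True if the node bitmap has zero bits or exactly one bit set."""
    return (node_vector & (node_vector - 1)) == 0
```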

11.2.2. Accessing by Configuration Register Instructions

The main difference when using the processor core’s configuration register instructions is that access to the per-core interrupt status registers becomes private: every core issues a query to the same address and obtains its own interrupt status.

Table 80. Extended I/O interrupt status register for the current processor core
Name Offset Address Description

perCore_EXT_IOIsr[63:0]

0x1800

Interrupt status of the extended I/O interrupt [63:0] routed to the current processor core

perCore_EXT_IOIsr[127:64]

0x1808

Interrupt status of the extended I/O interrupt [127:64] routed to the current processor core

perCore_EXT_IOIsr[191:128]

0x1810

Interrupt status of the extended I/O interrupt [191:128] routed to the current processor core

perCore_EXT_IOIsr[255:192]

0x1818

Interrupt status of the extended I/O interrupt [255:192] routed to the current processor core

11.2.3. Extended I/O Interrupt Trigger Register

To support the dynamic distribution of extended I/O interrupts, an extended I/O interrupt trigger register is added to the configuration registers; writing to it sets the corresponding extended I/O interrupt. In normal operation, this register can be used for debugging or testing interrupts.

The description of this register is as follows:

Table 81. Extended I/O interrupt trigger register
Name Offset Address Read/Write Description

EXT_IOI_send

0x1140

WO

Extended I/O interrupt setting register

[7:0] is the interrupt vector expected to be set

11.2.4. Difference in Handling Between Extended I/O Interrupts and Legacy HT Interrupts

With legacy HT interrupt processing, HT interrupts are handled inside the HT controller and mapped directly to the 256 interrupt vectors in the HT configuration registers. The 256 interrupt vectors are then grouped to generate 4 or 8 interrupts, which are routed to the processor cores. Because legacy interrupts are delivered over interrupt lines, cross-chip interrupts cannot be generated directly, so all HT I/O interrupts can only be handled by a single chip. Furthermore, the hardware within the chip distributes interrupts only at the granularity of these final 4 or 8 interrupts and not bit by bit, which results in poor hardware interrupt distribution.

With the extended I/O interrupt method, HT interrupts are sent directly from the HT controller to the chip’s interrupt controller for processing, so the interrupt controller receives all 256 interrupts directly instead of the previous 4 or 8. Each of these 256 interrupts can be routed and distributed independently, and can be distributed and rotated across chips.

With Extended I/O interrupts, the software processing is slightly different than with legacy HT interrupts.

With legacy HT interrupts, the kernel reads the interrupt vector of the HT controller directly (typically at 0x90000efdfb000080) and then processes the interrupts bit by bit, regardless of how the routing mode is configured.

With extended I/O interrupts, each core reads the extended I/O interrupt status register (configuration space offset 0x1800) directly to obtain the interrupt status for processing. Each core reads and processes only its own interrupt status, so different cores do not interfere with each other.
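The per-core handling described above can be sketched as follows. The Python function below takes the four 64-bit status words a core would read from configuration space offsets 0x1800, 0x1808, 0x1810 and 0x1818 and lists the pending vector numbers. The function name pending_vectors is hypothetical, and reading the registers and dispatching to device handlers are outside the scope of the sketch.

```python
def pending_vectors(status_words):
    """Return the pending extended I/O interrupt vectors, lowest first.

    status_words -- four 64-bit values, for vectors [63:0], [127:64],
                    [191:128] and [255:192] in that order
    """
    pending = []
    for index, word in enumerate(status_words):
        while word:
            lowest = word & -word                   # isolate the lowest set bit
            pending.append(index * 64 + lowest.bit_length() - 1)
            word &= word - 1                        # clear that bit and continue
    return pending
```

Scanning from the lowest vector upwards is a choice made here for simplicity; software is free to apply its own priority scheme.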

12. Temperature Sensor

12.1. Real-time Temperature Collection

Two temperature sensors are integrated inside the Loongson 3A5000. Their readings can be observed through the sampling register starting at 0x1FE00198, and they can be used with the configurable high/low temperature interrupt alarms or the automatic frequency adjustment function. The bit fields of the temperature sensors in the sampling register are as follows (base address is 0x1FE00000, offset address is 0x0198):

Table 82. Description of temperature collection register
Bit Field Name Read/Write Reset Value Description

24

Thsens0_overflow

R

Temperature sensor 0 overflow

25

Thsens1_overflow

R

Temperature sensor 1 overflow

47:32

Thsens0_out

R

Temperature sensor 0 centigrade temperature

Node temperature = Thsens0_out * 731 / 0x4000 - 273

Temperature range: -40 to 125 degrees Celsius

63:48

Thsens1_out

R

Temperature sensor 1 centigrade temperature

Node temperature = Thsens1_out * 731 / 0x4000 - 273

Temperature range: -40 to 125 degrees Celsius
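The conversion formula in the table can be applied as in the following Python sketch. The function names are hypothetical. The sketch assumes that the sampling register is read as one 64-bit value and that Thsens1_out occupies its upper 16 bits, which is the only placement that fits a 64-bit register.

```python
def raw_to_celsius(raw):
    """Convert a 16-bit sensor reading to degrees Celsius (formula from the table)."""
    return raw * 731 / 0x4000 - 273


def decode_sample(reg):
    """Split the 64-bit temperature collection register into its fields."""
    return {
        "thsens0_overflow": bool((reg >> 24) & 1),
        "thsens1_overflow": bool((reg >> 25) & 1),
        "thsens0_celsius": raw_to_celsius((reg >> 32) & 0xFFFF),
        "thsens1_celsius": raw_to_celsius((reg >> 48) & 0xFFFF),
    }
```

For example, a raw reading of 0x2000 corresponds to 0x2000 * 731 / 0x4000 - 273 = 92.5 degrees Celsius.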

The control registers can be used to enable an interrupt when the temperature rises above a preset value, an interrupt when it falls below a preset value, and automatic frequency reduction at high temperature.

In addition, the current temperature in degrees Celsius can be read directly from the new centigrade temperature register. This register can be read by address, using base address 0x1FE00000 or 0x3FF00000, or directly using a configuration register instruction, at offset 0x0428. The register is described as follows:

Table 83. Centigrade temperature register
Name Offset Address Read/Write Description

Thsens_Temperature

0x0428

R

Temperature sensor centigrade temperature

12.2. High/Low Temperature Interrupt Trigger

For the high and low temperature interrupt alarm function, there are 4 groups of control registers to set the threshold values. Each group contains the following three control fields:

GATE: Threshold value for high or low temperature. An interrupt is generated when the input temperature is higher than the high temperature threshold or lower than the low temperature threshold. Note that GATE must be set to the 16-bit raw value as reported in the 0x198 register, not to the temperature in degrees Celsius.

EN: Interrupt enable control. The settings of the group take effect only when this bit is set to 1.

SEL: Input temperature selection. This field selects which sensor’s temperature is used as input. Either 0 or 1 can be used.

The high temperature interrupt control register contains four sets of setting bits to control the triggering of high temperature interrupts; the low temperature interrupt control register contains four sets of setting bits to control the triggering of low temperature interrupts. There is another set of registers for displaying the interrupt status, corresponding to the high-temperature interrupt and low-temperature interrupt, respectively, and any write operation to this register will clear the interrupt status.

These registers are described below. Their base address is 0x1fe00000; they can also be accessed using the configuration register instruction (IOCSR):

Table 84. Description of high/low temperature interrupt register
Register Address Read/Write Description

High temperature interrupt control register Thsens_int_ctrl_Hi

0x1460

RW

[7:0]: Hi_gate0: high temperature threshold 0, above which an interrupt will be generated

[8:8]: Hi_en0: high temperature interrupt enable 0

[11:10]: Hi_Sel0: select temperature sensor input source for high temperature interrupt 0

[23:16]: Hi_gate1: high temperature threshold 1, above which an interrupt will be generated

[24:24]: Hi_en1: high temperature interrupt enable 1

[27:26]: Hi_Sel1: select temperature sensor input source for high temperature interrupt 1

[39:32]: Hi_gate2: high temperature threshold 2, above which an interrupt will be generated

[40:40]: Hi_en2: high temperature interrupt enable 2

[43:42]: Hi_Sel2: select temperature sensor input source for high temperature interrupt 2

[55:48]: Hi_gate3: high temperature threshold 3, above which an interrupt will be generated

[56:56]: Hi_en3: high temperature interrupt enable 3

[59:58]: Hi_Sel3: select temperature sensor input source for high temperature interrupt 3

Low temperature interrupt control register Thsens_int_ctrl_Lo

0x1468

RW

[7:0]: Lo_gate0: low temperature threshold 0, below which an interrupt will be generated

[8:8]: Lo_en0: low temperature interrupt enable 0

[11:10]: Lo_Sel0: select temperature sensor input source for low temperature interrupt 0

[23:16]: Lo_gate1: low temperature threshold 1, below which an interrupt will be generated

[24:24]: Lo_en1: low temperature interrupt enable 1

[27:26]: Lo_Sel1: select temperature sensor input source for low temperature interrupt 1

[39:32]: Lo_gate2: low temperature threshold 2, below which an interrupt will be generated

[40:40]: Lo_en2: low temperature interrupt enable 2

[43:42]: Lo_Sel2: select temperature sensor input source for low temperature interrupt 2

[55:48]: Lo_gate3: low temperature threshold 3, below which an interrupt will be generated

[56:56]: Lo_en3: low temperature interrupt enable 3

[59:58]: Lo_Sel3: select temperature sensor input source for low temperature interrupt 3

Interrupt status register Thsens_int_status/clr

0x1470

RW

Interrupt status register; write 1 to clear the interrupt

[0]: high temperature interrupt trigger

[1]: low temperature interrupt trigger

High order bits of high temperature interrupt control register Thsens_int_up

0x1478

RW

[7:0]: Hi_gate0: high 8-bit

[15:8]: Hi_gate1: high 8-bit

[23:16]: Hi_gate2: high 8-bit

[31:24]: Hi_gate3: high 8-bit

[39:32]: Lo_gate0: high 8-bit

[47:40]: Lo_gate1: high 8-bit

[55:48]: Lo_gate2: high 8-bit

[63:56]: Lo_gate3: high 8-bit
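Because GATE takes the raw 16-bit sensor value rather than degrees Celsius, setting a threshold involves inverting the conversion formula and splitting the result across two registers. The Python sketch below is one possible reading of the table above: it assumes that the low 8 bits of the gate go into Thsens_int_ctrl_Hi or Thsens_int_ctrl_Lo and the high 8 bits into Thsens_int_up. All function names are hypothetical.

```python
def celsius_to_raw(celsius):
    """Invert temperature = raw * 731 / 0x4000 - 273 and round to an integer."""
    return round((celsius + 273) * 0x4000 / 731)


def int_ctrl_fields(group, gate16, sensor, enable=True):
    """Return (control bits, high gate byte) for threshold group 0..3.

    The control bits belong in Thsens_int_ctrl_Hi or Thsens_int_ctrl_Lo;
    each group occupies 16 bits of that register.
    """
    if not 0 <= group <= 3:
        raise ValueError("group must be 0..3")
    if not 0 <= gate16 <= 0xFFFF:
        raise ValueError("gate must fit in 16 bits")
    if not 0 <= sensor <= 3:            # the Sel field is 2 bits wide
        raise ValueError("sensor must fit in 2 bits")
    group_bits = (gate16 & 0xFF) | (int(enable) << 8) | (sensor << 10)
    return group_bits << (16 * group), (gate16 >> 8) & 0xFF


def int_up_bits(group, up_byte, low_temp=False):
    """Place the high gate byte in Thsens_int_up (Hi gates at [31:0], Lo gates at [63:32])."""
    return up_byte << (8 * group + (32 if low_temp else 0))
```

The values returned for the four groups would be OR-ed together before writing each register; writes to hardware are not shown.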

12.3. High Temperature Automatic Underclock Configuration

To keep the chip operating in a high temperature environment, automatic high-temperature frequency reduction can be configured, so that the chip actively divides its clock when the temperature exceeds the preset range, reducing the chip’s toggle rate.

For the high-temperature frequency reduction function, there are four groups of control registers to set its behavior. Each group contains the following four control fields:

GATE: Threshold value for high or low temperature. The frequency division is triggered when the input temperature is higher than the high temperature threshold or lower than the low temperature threshold.

EN: Enable control. The settings of the group take effect only when this bit is set to 1.

SEL: Input temperature selection. This field selects which sensor’s temperature is used as input.

FREQ: Frequency division value. When frequency division is triggered, the clock is divided using the preset FREQ value. The division mode is controlled by freqscale_mode_node.

The base address is 0x1fe00000; the registers can also be accessed using the configuration register instruction (IOCSR).

Table 85. Description of high-temperature underclock control register
Register Address Read/Write Description

High-temperature underclock control register Thsens_freq_scale

0x1480

RW

The four groups of configurations are prioritized from highest to lowest

[7:0]: Scale_gate0: high temperature threshold 0, beyond which the frequency will be reduced

[8:8]: Scale_en0: high temperature underclock enable 0

[11:10]: Scale_Sel0: select temperature sensor input source for high temperature underclock 0

[14:12]: Scale_freq0: frequency division value at underclock

[23:16]: Scale_gate1: high temperature threshold 1, beyond which the frequency will be reduced

[24:24]: Scale_en1: high temperature underclock enable 1

[27:26]: Scale_Sel1: select temperature sensor input source for high temperature underclock 1

[30:28]: Scale_freq1: frequency division value at underclock

[39:32]: Scale_gate2: high temperature threshold 2, beyond which the frequency will be reduced

[40:40]: Scale_en2: high temperature underclock enable 2

[43:42]: Scale_Sel2: select temperature sensor input source for high temperature underclock 2

[46:44]: Scale_freq2: frequency division value at underclock

[55:48]: Scale_gate3: high temperature threshold 3, beyond which the frequency will be reduced

[56:56]: Scale_en3: high temperature underclock enable 3

[59:58]: Scale_Sel3: select temperature sensor input source for high temperature underclock 3

[62:60]: Scale_freq3: frequency division value at underclock

Thsens_freq_scale_up

0x1490

RW

High order bits of temperature sensor control register

[7:0]: Scale_Hi_gate0: high 8-bit

[15:8]: Scale_Hi_gate1: high 8-bit

[23:16]: Scale_Hi_gate2: high 8-bit

[31:24]: Scale_Hi_gate3: high 8-bit

[39:32]: Scale_Lo_gate0: high 8-bit

[47:40]: Scale_Lo_gate1: high 8-bit

[55:48]: Scale_Lo_gate2: high 8-bit

[63:56]: Scale_Lo_gate3: high 8-bit

12.4. Temperature Status Detection and Control

The pins PROCHOTn and THERMTRIPn are used for temperature status detection and control, and are multiplexed with GPIO14 and GPIO15 respectively. PROCHOTn can be used as either an input or an output, while THERMTRIPn is output only. When PROCHOTn is used as an input, the chip is controlled by an external temperature detection circuit: the circuit drives PROCHOTn to 0 when the chip temperature needs to be lowered, and the chip reduces its frequency on receiving this low level. When PROCHOTn is used as an output, the chip outputs a high-temperature interrupt; the prochotn_o_sel register selects one of the four interrupts set by the high-temperature interrupt control register as the external high-temperature interrupt.

For THERMTRIPn, the thermtripn_o_sel register selects one of the 4 interrupts set by the high-temperature interrupt control register as the external high-temperature interrupt.

Although both THERMTRIPn and PROCHOTn are external high temperature interrupts, THERMTRIPn has a higher degree of urgency than PROCHOTn. When PROCHOTn is set, the external temperature control circuit can also take certain measures, such as increasing the fan speed. In contrast, when THERMTRIPn is set, the external power control circuitry should take direct emergency power-off measures.

The specific control registers are as follows:

Table 86. Description of temperature status detection and control register
Register Address Read/Write Description

Temperature status detection and control register Thsens_hi_ctrl

0x1498

RW

[0:0]: prochotn_oe: PROCHOTn pin output enable control, 0 for output, 1 for input

[5:4]: prochotn_o_sel: PROCHOTn high temperature interrupt output selection

[10:8]: prochotn_freq_scale: PROCHOTn frequency division value when input is valid

[17:16]: thermtripn_o_sel: THERMTRIPn high temperature interrupt output selection

12.5. Control of Temperature Sensors

The 3A5000 has 4 internal temperature sensors, which can be configured via registers to adjust the temperature/voltage monitoring mode, the monitoring point and the monitoring frequency. The output of each temperature sensor can also be observed directly for debugging. The base address is 0x1FE00000, and the registers can also be accessed using the configuration register instruction (IOCSR). The offset address of the temperature sensor configuration register is 0x01580 + (vtsensor_id << 4), and the offset address of the temperature sensor data register is 0x01588 + (vtsensor_id << 4).

Note that the voltage monitoring function is currently not available.

Table 87. Description of temperature sensor configuration register
Bit | Field Name | Read/Write | Reset Value | Description
0 | Thsens_trigger | RW | 0 | Enable temperature sensor configuration. If set, monitoring mode and monitoring point can be selected by thsens_mode and thsens_cluster; 0 is the default temperature monitoring mode and the monitoring point is configured by temp_cluster
2 | Thsens_mode | RW | 0 | 0: temperature mode; 1: voltage mode
3 | Thsens_datarate | RW | 0 | Monitoring frequency: 0: 10-20 Hz; 1: 325-650 Hz
6:4 | Thsens_cluster | RW | 0 | Sensor monitoring point configuration: 0 is local monitoring point, 1-7 is remote monitoring point
8 | Temp_valid | RW | 0 | Enable the temperature sensor output and replace the value of Thsens0_out and Thsens0_overflow in CSR[0x198] with the temperature monitoring value of this temperature sensor.
11:9 | Temp_cluster | RW | 0 | Temperature sensor output monitoring point selection. It is disabled when Thsens_trigger is enabled
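As an illustration of the bit layout in Table 87, the following minimal sketch packs field values into a configuration register word. The function name and its keyword arguments are hypothetical; only the bit positions and widths are taken from the table. Because voltage monitoring is noted as unavailable, the mode field would normally stay at 0.

```python
# (field name, shift, width) triples taken from Table 87.
THSENS_CONFIG_FIELDS = (
    ("thsens_trigger", 0, 1),
    ("thsens_mode", 2, 1),
    ("thsens_datarate", 3, 1),
    ("thsens_cluster", 4, 3),
    ("temp_valid", 8, 1),
    ("temp_cluster", 9, 3),
)


def pack_thsens_config(**values):
    """Build a configuration word; unspecified fields keep their reset value 0."""
    known = {name for name, _, _ in THSENS_CONFIG_FIELDS}
    unknown = set(values) - known
    if unknown:
        raise KeyError(f"unknown field(s): {sorted(unknown)}")
    word = 0
    for name, shift, width in THSENS_CONFIG_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << shift
    return word


if __name__ == "__main__":
    # Select remote monitoring point 2 at the higher monitoring frequency.
    word = pack_thsens_config(thsens_trigger=1, thsens_datarate=1, thsens_cluster=2)
    print(hex(word))
```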

Table 88. Description of temperature sensor data register
Bit | Field Name | Read/Write | Reset Value | Description
3 | Out_mode | R | 0 | Monitoring mode for sensor configuration. 0: temperature mode; 1: voltage mode
6:4 | Out_cluster | R | 0 | Monitoring points for sensor configuration
7 | Overflow | R | 0 | Overflow of sensor monitoring values
29:16 | Data | R | 0 | Sensor readout monitoring values

Calculation of the readout value:

Node temperature = data*731/0x4000 - 273 (temperature range: -40 to 125 degrees Celsius)

Voltage = data*1.226/0x1000
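The field layout of Table 88 and the two formulas above can be combined into a short decoding sketch. The function names are hypothetical, the raw register value is assumed to have been read already, and the temperature unit is taken to be degrees Celsius, as the -273 offset suggests.

```python
def decode_thsens_data(raw):
    """Split a raw data register value into the fields of Table 88."""
    return {
        "out_mode": (raw >> 3) & 0x1,      # bit 3: 0 temperature, 1 voltage
        "out_cluster": (raw >> 4) & 0x7,   # bits 6:4: monitoring point
        "overflow": (raw >> 7) & 0x1,      # bit 7: overflow flag
        "data": (raw >> 16) & 0x3FFF,      # bits 29:16: 14-bit readout
    }


def data_to_temperature(data):
    """Node temperature = data*731/0x4000 - 273 (assumed degrees Celsius)."""
    return data * 731 / 0x4000 - 273


def data_to_voltage(data):
    """Voltage = data*1.226/0x1000 (voltage mode is noted as unavailable)."""
    return data * 1.226 / 0x1000


if __name__ == "__main__":
    raw = (0x2000 << 16) | (3 << 4)        # made-up example value, not real data
    fields = decode_thsens_data(raw)
    if not fields["overflow"] and fields["out_mode"] == 0:
        print(fields["out_cluster"], data_to_temperature(fields["data"]))
```

Checking the overflow flag before converting avoids reporting a temperature from an invalid sample.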

13. DDR4 SDRAM Controller Configuration

The Loongson 3A5000 processor’s internally integrated memory controller is designed to comply with the DDR4 SDRAM industry standard (JESD79-4).

13.1. Introduction to DDR4 SDRAM Controller Functions

The Loongson 3A5000 processor supports both DDP and 3DS packaging modes. DDP mode supports up to 8 CSs (implemented by 8 DDR3/DDR4 SDRAM chip select signals, i.e., 4 double-sided memory modules), and 3DS mode supports up to 4 CSs (implemented by 8 DDR4 SDRAM chip select signals, i.e., 32 logical ranks). The address bus has 22 bits in total: 18 bits of row address, 2 bits of logical bank address, and 2 bits of logical bank group address, where the row address bus is multiplexed with RASn, CASn, and WEn.

The DDR4 controller parameters of the Loongson 3A5000 processor can be adjusted to support the specific memory chip type selected for a design. The controller supports up to 8 chip selects (CS_n), 8 logical ranks (Chip ID), 18 row address bits (ROW), 12 column address bits (COL), 2 logical bank address bits (BANK, for DDR4), and 2 logical bank group address bits (BANK Group). The multiplexing relationship between CS_n and Chip ID is configurable; see DDR4 SDRAM Parameter Configuration Format for details.
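The arithmetic behind these limits can be checked with a small sketch. The names are hypothetical, and reading the 32 logical ranks of 3DS mode as 4 chip selects times 8 Chip IDs is an interpretation of the text rather than a statement from the manual.

```python
# Maximum widths quoted in the text above.
ROW_BITS = 18
BANK_BITS = 2
BANK_GROUP_BITS = 2


def address_bus_bits(row=ROW_BITS, bank=BANK_BITS, bank_group=BANK_GROUP_BITS):
    """Total address bus width: row + bank + bank group bits."""
    return row + bank + bank_group


def logical_ranks(cs_count, chip_ids_per_cs):
    """Logical ranks, assuming every chip select exposes the same Chip ID count."""
    return cs_count * chip_ids_per_cs


if __name__ == "__main__":
    print(address_bus_bits())     # width of the address bus in bits
    print(logical_ranks(4, 8))    # 3DS mode: 4 CSs, 8 Chip IDs each (assumed)
```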

The physical address of the memory request sent by the CPU can be mapped in many different ways according to different configurations inside the controller.

The memory control circuitry integrated in the Loongson 3A5000 processor only accepts memory read/write requests from the processor or external devices, and is in the Slave State for all memory read/write operations.

The memory controller in the Loongson 3A5000 processor has the following features:

  • Fully pipelined operation of commands and read/write data on the interface.

  • Memory command merging and sequencing to improve overall bandwidth.

  • Configuration register read and write ports, which can modify the basic parameters of memory devices.

  • Built-in dynamic delay compensation circuit (DCC) for reliable sending and receiving of data.

  • ECC function can detect 1-bit and 2-bit errors on the data path and can automatically correct 1-bit errors.

  • DDR3/DDR4 SDRAM support, with parameter configuration for x4, x8, and x16 devices.

  • Controller to PHY frequency ratio of 1/2.

  • Supports data transfer rates from 800 Mbps to 3200 Mbps.

13.2. DDR4 SDRAM Parameter Configuration Format

13.2.1. Parameter List of the Memory Controller
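The parameter list in Table 89 repeats the same rows for each data slice: the clock control rows for ds_0 to ds_8 start at offset 0x0050 and are spaced 0x10 apart, and the per-slice delay blocks start at 0x0100 and are spaced 0x80 apart. The following minimal sketch computes these row offsets. All names are hypothetical, each row is labeled by one representative field it contains, and extending the 0x80-spaced blocks to all nine slices is an assumption based on the regular pattern.

```python
DS_CLOCK_BASE = 0x0050      # dbl_ctrl_ds_0 / dll_dbl_ds_0 row
DS_CLOCK_STRIDE = 0x10      # two 8-byte rows per slice
DS_BLOCK_BASE = 0x0100      # first row of the slice 0 delay block
DS_BLOCK_STRIDE = 0x80      # sixteen 8-byte rows per slice
NUM_DATA_SLICES = 9         # ds_0 .. ds_8

# Row offsets inside one delay block, keyed by a representative field name.
DS_BLOCK_ROWS = {
    "dll_wrdq": 0x00,
    "dll_gate": 0x08,
    "rdgate_ctrl": 0x10,
    "rddqs_phase": 0x18,
    "w_bdly0": 0x20,
    "w_bdly1": 0x30,
    "rg_bdly": 0x40,
    "rdqsp_bdly": 0x50,
    "rdqsn_bdly": 0x60,
    "rdq_bdly": 0x70,
}


def _check_slice(slice_id):
    if not 0 <= slice_id < NUM_DATA_SLICES:
        raise ValueError(f"slice_id must be in 0..{NUM_DATA_SLICES - 1}")


def ds_clock_offsets(slice_id):
    """Return offsets of the (dbl_ctrl_ds_n, pll_ctrl_ds_n) rows for one slice."""
    _check_slice(slice_id)
    base = DS_CLOCK_BASE + slice_id * DS_CLOCK_STRIDE
    return base, base + 0x08


def ds_row_offset(slice_id, row):
    """Return the offset of one row in the delay block of a data slice."""
    _check_slice(slice_id)
    return DS_BLOCK_BASE + slice_id * DS_BLOCK_STRIDE + DS_BLOCK_ROWS[row]


if __name__ == "__main__":
    for n in range(NUM_DATA_SLICES):
        print(n, [hex(x) for x in ds_clock_offsets(n)], hex(ds_row_offset(n, "dll_gate")))
```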

Table 89. Software-visible parameter list of the memory controller
Offset 63:56 55:48 47:40 39:32 31:24 23:16 15:8 7:0

PHY

0x0000

version(RD)

0x0008

x4_mode

ddr3_mode

capability(RD)

0x0010

dram_init(RD)

init_start

0x0018

0x0020

preamble2

rdfifo_valid

0x0028

rdfifo_empty(RD)

Overflow(RD)

0x0030

dll_value(RD)

dll_init_done(RD)

dll_lock_mode

dll_bypass

dll_adjj_cnt

dll_increment

dll_start_point

0x0038

dll_dbl_fix

dll_close_disable

dll_ck

0x0040

dbl_ctrl_ckca

dll_dbl_ckca

0x0048

pll_ctrl_ckca

pll_lock_ckca(RD)

dll_lock_ckca(RD)

clken_ckca

clksel_ckca

0x0050

dbl_ctrl_ds_0

dll_dbl_ds_0

0x0058

pll_ctrl_ds_0

pll_lock_ds_0(RD)

dll_lock_ds_0(RD)

clken_ds_0

clksel_ds_0

0x0060

dbl_ctrl_ds_1

dll_dbl_ds_1

0x0068

pll_ctrl_ds_1

pll_lock_ds_1(RD)

dll_lock_ds_1(RD)

clken_ds_1

clksel_ds_1

0x0070

dbl_ctrl_ds_2

dll_dbl_ds_2

0x0078

pll_ctrl_ds_2

pll_lock_ds_2(RD)

dll_lock_ds_2(RD)

clken_ds_2

clksel_ds_2

0x0080

dbl_ctrl_ds_3

dll_dbl_ds_3

0x0088

pll_ctrl_ds_3

pll_lock_ds_3(RD)

dll_lock_ds_3(RD)

clken_ds_3

clksel_ds_3

0x0090

dbl_ctrl_ds_4

dll_dbl_ds_4

0x0098

pll_ctrl_ds_4

pll_lock_ds_4(RD)

dll_lock_ds_4(RD)

clken_ds_4

clksel_ds_4

0x00a0

dbl_ctrl_ds_5

dll_dbl_ds_5

0x00a8

pll_ctrl_ds_5

pll_lock_ds_5(RD)

dll_lock_ds_5(RD)

clken_ds_5

clksel_ds_5

0x00b0

dbl_ctrl_ds_6

dll_dbl_ds_6

0x00b8

pll_ctrl_ds_6

pll_lock_ds_6(RD)

dll_lock_ds_6(RD)

clken_ds_6

clksel_ds_6

0x00c0

dbl_ctrl_ds_7

dll_dbl_ds_7

0x00c8

pll_ctrl_ds_7

pll_lock_ds_7(RD)

dll_lock_ds_7(RD)

clken_ds_7

clksel_ds_7

0x00d0

dbl_ctrl_ds_8

dll_dbl_ds_8

0x00d8

pll_ctrl_ds_8

pll_lock_ds_8(RD)

dll_lock_ds_8(RD)

clken_ds_8

clksel_ds_8

0x00e0

vrefclk_inv

vref_sample

vref_num

vref_dly

dll_vref

…​…​

0x0100

dll_1xdly_0

dll_1xgen_0

dll_wrdqs_0

dll_wrdq_0

0x0108

dll_gate_0

dll_rddqs1_0

dll_rddqs0_0

0x0110

rdodt_ctrl_0

rdgate_len_0

rdgate_mode_0

rdgate_ctrl_0

dqs_oe_ctrl_0

dq_oe_ctrl_0

0x0118

dly_2x_0

redge_sel_0

rddqs_phase_0(RD)

0x0120

w_bdly0_0[31:28]

w_bdly0_0[27:24]

w_bdly0_0[23:20]

w_bdly0_0[19:16]

w_bdly0_0[15:12]

w_bdly0_0[11:8]

w_bdly0_0[7:4]

w_bdly0_0[3:0]

0x0128

w_bdly0_0[59:56]

w_bdly0_0[55:52]

w_bdly0_0[51:48]

w_bdly0_0[47:44]

w_bdly0_0[43:40]

w_bdly0_0[39:36]

w_bdly0_0[35:32]

0x0130

w_bdly1_0[24:21]

w_bdly1_0[20:18]

w_bdly1_0[17:15]

w_bdly1_0[14:12]

w_bdly1_0[11:9]

w_bdly1_0[8:6]

w_bdly1_0[5:3]

w_bdly1_0[2:0]

0x0138

w_bdly1_0[27:26]

0x0140

rg_bdly_0[7:4]

rg_bdly_0[3:0]

0x0148

0x0150

rdqsp_bdly_0[31:28]

rdqsp_bdly_0[27:24]

rdqsp_bdly_0[23:20]

rdqsp_bdly_0[19:16]

rdqsp_bdly_0[15:12]

rdqsp_bdly_0[11:8]

rdqsp_bdly_0[7:4]

rdqsp_bdly_0[3:0]

0x0158

rdqsp_bdly_0[35:32]

0x0160

rdqsn_bdly_0[31:28]

rdqsn_bdly_0[27:24]

rdqsn_bdly_0[23:20]

rdqsn_bdly_0[19:16]

rdqsn_bdly_0[15:12]

rdqsn_bdly_0[11:8]

rdqsn_bdly_0[7:4]

rdqsn_bdly_0[3:0]

0x0168

rdqsn_bdly_0[35:32]

0x0170

rdq_bdly_0[24:21]

rdq_bdly_0[20:18]

rdq_bdly_0[17:15]

rdq_bdly_0[14:12]

rdq_bdly_0[11:9]

rdq_bdly_0[8:6]

rdq_bdly_0[5:3]

rdq_bdly_0[2:0]

0x0178

rdq_bdly_0[27:26]

0x0180

dll_1xdly_1

dll_1xgen_1

dll_wrdqs_1

dll_wrdq_1

0x0188

dll_gate_1

dll_rddqs1_1

dll_rddqs0_1

0x0190

rdodt_ctrl_1

rdgate_len_1

rdgate_mode_1

rdgate_ctrl_1

dqs_oe_ctrl_1

dq_oe_ctrl_1

0x0198

dly_2x_1

redge_sel_1

rddqs_phase_1(RD)

0x01a0

w_bdly0_1[31:28]

w_bdly0_1[27:24]

w_bdly0_1[23:20]

w_bdly0_1[19:16]

w_bdly0_1[15:12]

w_bdly0_1[11:8]

w_bdly0_1[7:4]

w_bdly0_1[3:0]

0x01a8

w_bdly0_1[59:56]

w_bdly0_1[55:52]

w_bdly0_1[51:48]

w_bdly0_1[47:44]

w_bdly0_1[43:40]

w_bdly0_1[39:36]

w_bdly0_1[35:32]

0x01b0

w_bdly1_1[24:21]

w_bdly1_1[20:18]

w_bdly1_1[17:15]

w_bdly1_1[14:12]

w_bdly1_1[11:9]

w_bdly1_1[8:6]

w_bdly1_1[5:3]

w_bdly1_1[2:0]

0x01b8

w_bdly1_1[27:26]

0x01c0

rg_bdly_1[7:4]

rg_bdly_1[3:0]

0x01c8

0x01d0

rdqsp_bdly_1[31:28]

rdqsp_bdly_1[27:24]

rdqsp_bdly_1[23:20]

rdqsp_bdly_1[19:16]

rdqsp_bdly_1[15:12]

rdqsp_bdly_1[11:8]

rdqsp_bdly_1[7:4]

rdqsp_bdly_1[3:0]

0x01d8

rdqsp_bdly_1[35:32]

0x01e0

rdqsn_bdly_1[31:28]

rdqsn_bdly_1[27:24]

rdqsn_bdly_1[23:20]

rdqsn_bdly_1[19:16]

rdqsn_bdly_1[15:12]

rdqsn_bdly_1[11:8]

rdqsn_bdly_1[7:4]

rdqsn_bdly_1[3:0]

0x01e8

rdqsn_bdly_1[35:32]

0x01f0

rdq_bdly_1[24:21]

rdq_bdly_1[20:18]

rdq_bdly_1[17:15]

rdq_bdly_1[14:12]

rdq_bdly_1[11:9]

rdq_bdly_1[8:6]

rdq_bdly_1[5:3]

rdq_bdly_1[2:0]

0x01f8

rdq_bdly_1[27:26]

0x0200

dll_1xdly_2

dll_1xgen_2

dll_wrdqs_2

dll_wrdq_2

0x0208

dll_gate_2

dll_rddqs1_2

dll_rddqs0_2

0x0210

rdodt_ctrl_2

rdgate_len_2

rdgate_mode_2

rdgate_ctrl_2

dqs_oe_ctrl_2

dq_oe_ctrl_2

0x0218

dly_2x_2

redge_sel_2

rddqs_phase_2(RD)

0x0220

w_bdly0_2[31:28]

w_bdly0_2[27:24]

w_bdly0_2[23:20]

w_bdly0_2[19:16]

w_bdly0_2[15:12]

w_bdly0_2[11:8]

w_bdly0_2[7:4]

w_bdly0_2[3:0]

0x0228

w_bdly0_2[59:56]

w_bdly0_2[55:52]

w_bdly0_2[51:48]

w_bdly0_2[47:44]

w_bdly0_2[43:40]

w_bdly0_2[39:36]

w_bdly0_2[35:32]

0x0230

w_bdly1_2[24:21]

w_bdly1_2[20:18]

w_bdly1_2[17:15]

w_bdly1_2[14:12]

w_bdly1_2[11:9]

w_bdly1_2[8:6]

w_bdly1_2[5:3]

w_bdly1_2[2:0]

0x0238

w_bdly1_2[27:26]

0x0240

rg_bdly_2[7:4]

rg_bdly_2[3:0]

0x0248

0x0250

rdqsp_bdly_2[31:28]

rdqsp_bdly_2[27:24]

rdqsp_bdly_2[23:20]

rdqsp_bdly_2[19:16]

rdqsp_bdly_2[15:12]

rdqsp_bdly_2[11:8]

rdqsp_bdly_2[7:4]

rdqsp_bdly_2[3:0]

0x0258

rdqsp_bdly_2[35:32]

0x0260

rdqsn_bdly_2[31:28]

rdqsn_bdly_2[27:24]

rdqsn_bdly_2[23:20]

rdqsn_bdly_2[19:16]

rdqsn_bdly_2[15:12]

rdqsn_bdly_2[11:8]

rdqsn_bdly_2[7:4]

rdqsn_bdly_2[3:0]

0x0268

rdqsn_bdly_2[35:32]

0x0270

rdq_bdly_2[24:21]

rdq_bdly_2[20:18]

rdq_bdly_2[17:15]

rdq_bdly_2[14:12]

rdq_bdly_2[11:9]

rdq_bdly_2[8:6]

rdq_bdly_2[5:3]

rdq_bdly_2[2:0]

0x0278

rdq_bdly_2[27:26]

0x0280

dll_1xdly_3

dll_1xgen_3

dll_wrdqs_3

dll_wrdq_3

0x0288

dll_gate_3

dll_rddqs1_3

dll_rddqs0_3

0x0290

rdodt_ctrl_3

rdgate_len_3

rdgate_mode_3

rdgate_ctrl_3

dqs_oe_ctrl_3

dq_oe_ctrl_3

0x0298

dly_2x_3

redge_sel_3

rddqs_phase_3(RD)

0x02a0

w_bdly0_3[31:28]

w_bdly0_3[27:24]

w_bdly0_3[23:20]

w_bdly0_3[19:16]

w_bdly0_3[15:12]

w_bdly0_3[11:8]

w_bdly0_3[7:4]

w_bdly0_3[3:0]

0x02a8

w_bdly0_3[59:56]

w_bdly0_3[55:52]

w_bdly0_3[51:48]

w_bdly0_3[47:44]

w_bdly0_3[43:40]

w_bdly0_3[39:36]

w_bdly0_3[35:32]

0x02b0

w_bdly1_3[24:21]

w_bdly1_3[20:18]

w_bdly1_3[17:15]

w_bdly1_3[14:12]

w_bdly1_3[11:9]

w_bdly1_3[8:6]

w_bdly1_3[5:3]

w_bdly1_3[2:0]

0x02b8

w_bdly1_3[27:26]

0x02c0

rg_bdly_3[7:4]

rg_bdly_3[3:0]

0x02c8

0x02d0

rdqsp_bdly_3[31:28]

rdqsp_bdly_3[27:24]

rdqsp_bdly_3[23:20]

rdqsp_bdly_3[19:16]

rdqsp_bdly_3[15:12]

rdqsp_bdly_3[11:8]

rdqsp_bdly_3[7:4]

rdqsp_bdly_3[3:0]

0x02d8

rdqsp_bdly_3[35:32]

0x02e0

rdqsn_bdly_3[31:28]

rdqsn_bdly_3[27:24]

rdqsn_bdly_3[23:20]

rdqsn_bdly_3[19:16]

rdqsn_bdly_3[15:12]

rdqsn_bdly_3[11:8]

rdqsn_bdly_3[7:4]

rdqsn_bdly_3[3:0]

0x02e8

rdqsn_bdly_3[35:32]

0x02f0

rdq_bdly_3[24:21]

rdq_bdly_3[20:18]

rdq_bdly_3[17:15]

rdq_bdly_3[14:12]

rdq_bdly_3[11:9]

rdq_bdly_3[8:6]

rdq_bdly_3[5:3]

rdq_bdly_3[2:0]

0x02f8

rdq_bdly_3[27:26]

0x0300

dll_1xdly_4

dll_1xgen_4

dll_wrdqs_4

dll_wrdq_4

0x0308

dll_gate_4

dll_rddqs1_4

dll_rddqs0_4

0x0310

rdodt_ctrl_4

rdgate_len_4

rdgate_mode_4

rdgate_ctrl_4

dqs_oe_ctrl_4

dq_oe_ctrl_4

0x0318

dly_2x_4

redge_sel_4

rddqs_phase_4(RD)

0x0320

w_bdly0_4[31:28]

w_bdly0_4[27:24]

w_bdly0_4[23:20]

w_bdly0_4[19:16]

w_bdly0_4[15:12]

w_bdly0_4[11:8]

w_bdly0_4[7:4]

w_bdly0_4[3:0]

0x0328

w_bdly0_4[59:56]

w_bdly0_4[55:52]

w_bdly0_4[51:48]

w_bdly0_4[47:44]

w_bdly0_4[43:40]

w_bdly0_4[39:36]

w_bdly0_4[35:32]

0x0330

w_bdly1_4[24:21]

w_bdly1_4[20:18]

w_bdly1_4[17:15]

w_bdly1_4[14:12]

w_bdly1_4[11:9]

w_bdly1_4[8:6]

w_bdly1_4[5:3]

w_bdly1_4[2:0]

0x0338

w_bdly1_4[27:26]

0x0340

rg_bdly_4[7:4]

rg_bdly_4[3:0]

0x0348

0x0350

rdqsp_bdly_4[31:28]

rdqsp_bdly_4[27:24]

rdqsp_bdly_4[23:20]

rdqsp_bdly_4[19:16]

rdqsp_bdly_4[15:12]

rdqsp_bdly_4[11:8]

rdqsp_bdly_4[7:4]

rdqsp_bdly_4[3:0]

0x0358

rdqsp_bdly_4[35:32]

0x0360

rdqsn_bdly_4[31:28]

rdqsn_bdly_4[27:24]

rdqsn_bdly_4[23:20]

rdqsn_bdly_4[19:16]

rdqsn_bdly_4[15:12]

rdqsn_bdly_4[11:8]

rdqsn_bdly_4[7:4]

rdqsn_bdly_4[3:0]

0x0368

rdqsn_bdly_4[35:32]

0x0370

rdq_bdly_4[24:21]

rdq_bdly_4[20:18]

rdq_bdly_4[17:15]

rdq_bdly_4[14:12]

rdq_bdly_4[11:9]

rdq_bdly_4[8:6]

rdq_bdly_4[5:3]

rdq_bdly_4[2:0]

0x0378

rdq_bdly_4[27:26]

0x0380

dll_1xdly_5

dll_1xgen_5

dll_wrdqs_5

dll_wrdq_5

0x0388

dll_gate_5

dll_rddqs1_5

dll_rddqs0_5

0x0390

rdodt_ctrl_5

rdgate_len_5

rdgate_mode_5

rdgate_ctrl_5

dqs_oe_ctrl_5

dq_oe_ctrl_5

0x0398

dly_2x_5

redge_sel_5

rddqs_phase_5(RD)

0x03a0

w_bdly0_5[31:28]

w_bdly0_5[27:24]

w_bdly0_5[23:20]

w_bdly0_5[19:16]

w_bdly0_5[15:12]

w_bdly0_5[11:8]

w_bdly0_5[7:4]

w_bdly0_5[3:0]

0x03a8

w_bdly0_5[59:56]

w_bdly0_5[55:52]

w_bdly0_5[51:48]

w_bdly0_5[47:44]

w_bdly0_5[43:40]

w_bdly0_5[39:36]

w_bdly0_5[35:32]

0x03b0

w_bdly1_5[24:21]

w_bdly1_5[20:18]

w_bdly1_5[17:15]

w_bdly1_5[14:12]

w_bdly1_5[11:9]

w_bdly1_5[8:6]

w_bdly1_5[5:3]

w_bdly1_5[2:0]

0x03b8

w_bdly1_5[27:26]

0x03c0

rg_bdly_5[7:4]

rg_bdly_5[3:0]

0x03c8

0x03d0

rdqsp_bdly_5[31:28]

rdqsp_bdly_5[27:24]

rdqsp_bdly_5[23:20]

rdqsp_bdly_5[19:16]

rdqsp_bdly_5[15:12]

rdqsp_bdly_5[11:8]

rdqsp_bdly_5[7:4]

rdqsp_bdly_5[3:0]

0x03d8

rdqsp_bdly_5[35:32]

0x03e0

rdqsn_bdly_5[31:28]

rdqsn_bdly_5[27:24]

rdqsn_bdly_5[23:20]

rdqsn_bdly_5[19:16]

rdqsn_bdly_5[15:12]

rdqsn_bdly_5[11:8]

rdqsn_bdly_5[7:4]

rdqsn_bdly_5[3:0]

0x03e8

rdqsn_bdly_5[35:32]

0x03f0

rdq_bdly_5[24:21]

rdq_bdly_5[20:18]

rdq_bdly_5[17:15]

rdq_bdly_5[14:12]

rdq_bdly_5[11:9]

rdq_bdly_5[8:6]

rdq_bdly_5[5:3]

rdq_bdly_5[2:0]

0x03f8

rdq_bdly_5[27:26]

0x0400

dll_1xdly_6

dll_1xgen_6

dll_wrdqs_6

dll_wrdq_6

0x0408

dll_gate_6