Merge 4.4.145 into android-4.4
Changes in 4.4.145
MIPS: ath79: fix register address in ath79_ddr_wb_flush()
ip: hash fragments consistently
net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
rtnetlink: add rtnl_link_state check in rtnl_configure_link
tcp: fix dctcp delayed ACK schedule
tcp: helpers to send special DCTCP ack
tcp: do not cancel delay-AcK on DCTCP special ACK
tcp: do not delay ACK in DCTCP upon CE status change
tcp: avoid collapses in tcp_prune_queue() if possible
tcp: detect malicious patterns in tcp_collapse_ofo_queue()
ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull
usb: cdc_acm: Add quirk for Castles VEGA3000
usb: core: handle hub C_PORT_OVER_CURRENT condition
usb: gadget: f_fs: Only return delayed status when len is 0
driver core: Partially revert "driver core: correct device's shutdown order"
can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK
can: xilinx_can: fix recovery from error states not being propagated
can: xilinx_can: fix device dropping off bus on RX overrun
can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting
can: xilinx_can: fix incorrect clear of non-processed interrupts
can: xilinx_can: fix RX overflow interrupt not being enabled
turn off -Wattribute-alias
ARM: fix put_user() for gcc-8
Linux 4.4.145
Change-Id: I449c110f7f186f2c72c9cc45e00a8deda0d54e40
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
diff --git a/.gitignore b/.gitignore
index fd3a355..b17dcd2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -33,6 +33,7 @@
*.lzo
*.patch
*.gcno
+*.ll
modules.builtin
Module.symvers
*.dwo
@@ -112,3 +113,6 @@
# Kdevelop4
*.kdev4
+
+# fetched Android config fragments
+android/configs/android-*.cfg
diff --git a/.mailmap b/.mailmap
index b1e9a97..8cbe2bd 100644
--- a/.mailmap
+++ b/.mailmap
@@ -14,6 +14,7 @@
Alan Cox <alan@lxorguk.ukuu.org.uk>
Alan Cox <root@hraefn.swansea.linux.org.uk>
Aleksey Gorelov <aleksey_gorelov@phoenix.com>
+Aleksandar Markovic <aleksandar.markovic@mips.com> <aleksandar.markovic@imgtec.com>
Al Viro <viro@ftp.linux.org.uk>
Al Viro <viro@zenIV.linux.org.uk>
Andreas Herrmann <aherrman@de.ibm.com>
@@ -85,6 +86,7 @@
Mayuresh Janorkar <mayur@ti.com>
Michael Buesch <m@bues.ch>
Michel Dänzer <michel@tungstengraphics.com>
+Miodrag Dinic <miodrag.dinic@mips.com> <miodrag.dinic@imgtec.com>
Mitesh shah <mshah@teja.com>
Mohit Kumar <mohit.kumar@st.com> <mohit.kumar.dhaka@gmail.com>
Morten Welinder <terra@gnome.org>
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index cd077ca..bd3f803 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -435,6 +435,8 @@
- info on the magic SysRq key.
target/
- directory with info on generating TCM v4 fabric .ko modules
+tee.txt
+ - info on the TEE subsystem and drivers
this_cpu_ops.txt
- List rationale behind and the way to use this_cpu operations.
thermal/
diff --git a/Documentation/ABI/testing/sysfs-class-dual-role-usb b/Documentation/ABI/testing/sysfs-class-dual-role-usb
new file mode 100644
index 0000000..a900fd7
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-dual-role-usb
@@ -0,0 +1,71 @@
+What: /sys/class/dual_role_usb/.../
+Date: June 2015
+Contact: Badhri Jagan Sridharan<badhri@google.com>
+Description:
+ Provide a generic interface to monitor and change
+ the state of dual role usb ports. The name here
+ refers to the name mentioned in the
+ dual_role_phy_desc that is passed while registering
+ the dual_role_phy_intstance through
+ devm_dual_role_instance_register.
+
+What: /sys/class/dual_role_usb/.../supported_modes
+Date: June 2015
+Contact: Badhri Jagan Sridharan<badhri@google.com>
+Description:
+ This is a static node, once initialized this
+ is not expected to change during runtime. "dfp"
+ refers to "downstream facing port" i.e. port can
+ only act as host. "ufp" refers to "upstream
+ facing port" i.e. port can only act as device.
+ "dfp ufp" refers to "dual role port" i.e. the port
+ can either be a host port or a device port.
+
+What: /sys/class/dual_role_usb/.../mode
+Date: June 2015
+Contact: Badhri Jagan Sridharan<badhri@google.com>
+Description:
+ The mode node refers to the current mode in which the
+ port is operating. "dfp" for host ports. "ufp" for device
+ ports and "none" when cable is not connected.
+
+ On devices where the USB mode is software-controllable,
+ userspace can change the mode by writing "dfp" or "ufp".
+ On devices where the USB mode is fixed in hardware,
+ this attribute is read-only.
+
+What: /sys/class/dual_role_usb/.../power_role
+Date: June 2015
+Contact: Badhri Jagan Sridharan<badhri@google.com>
+Description:
+ The power_role node mentions whether the port
+ is "sink"ing or "source"ing power. "none" if
+ they are not connected.
+
+ On devices implementing USB Power Delivery,
+ userspace can control the power role by writing "sink" or
+ "source". On devices without USB-PD, this attribute is
+ read-only.
+
+What: /sys/class/dual_role_usb/.../data_role
+Date: June 2015
+Contact: Badhri Jagan Sridharan<badhri@google.com>
+Description:
+ The data_role node mentions whether the port
+ is acting as "host" or "device" for USB data connection.
+ "none" if there is no active data link.
+
+ On devices implementing USB Power Delivery, userspace
+ can control the data role by writing "host" or "device".
+ On devices without USB-PD, this attribute is read-only
+
+What: /sys/class/dual_role_usb/.../powers_vconn
+Date: June 2015
+Contact: Badhri Jagan Sridharan<badhri@google.com>
+Description:
+ The powers_vconn node mentions whether the port
+ is supplying power for VCONN pin.
+
+ On devices with software control of VCONN,
+ userspace can disable the power supply to VCONN by writing "n",
+ or enable the power supply by writing "y".
diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index 0345f2d..f82da9b 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -51,12 +51,33 @@
Controls the dirty page count condition for the in-place-update
policies.
+What: /sys/fs/f2fs/<disk>/min_hot_blocks
+Date: March 2017
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+ Controls the dirty page count condition for redefining hot data.
+
+What: /sys/fs/f2fs/<disk>/min_ssr_sections
+Date: October 2017
+Contact: "Chao Yu" <yuchao0@huawei.com>
+Description:
+ Controls the fee section threshold to trigger SSR allocation.
+
What: /sys/fs/f2fs/<disk>/max_small_discards
Date: November 2013
Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>
Description:
Controls the issue rate of small discard commands.
+What: /sys/fs/f2fs/<disk>/discard_granularity
+Date: July 2017
+Contact: "Chao Yu" <yuchao0@huawei.com>
+Description:
+ Controls discard granularity of inner discard thread, inner thread
+ will not issue discards with size that is smaller than granularity.
+ The unit size is one block, now only support configuring in range
+ of [1, 512].
+
What: /sys/fs/f2fs/<disk>/max_victim_search
Date: January 2014
Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>
@@ -80,6 +101,7 @@
Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
Description:
Controls the trimming rate in batch mode.
+ <deprecated>
What: /sys/fs/f2fs/<disk>/cp_interval
Date: October 2015
@@ -87,8 +109,98 @@
Description:
Controls the checkpoint timing.
+What: /sys/fs/f2fs/<disk>/idle_interval
+Date: January 2016
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+ Controls the idle timing.
+
+What: /sys/fs/f2fs/<disk>/iostat_enable
+Date: August 2017
+Contact: "Chao Yu" <yuchao0@huawei.com>
+Description:
+ Controls to enable/disable IO stat.
+
What: /sys/fs/f2fs/<disk>/ra_nid_pages
Date: October 2015
Contact: "Chao Yu" <chao2.yu@samsung.com>
Description:
Controls the count of nid pages to be readaheaded.
+
+What: /sys/fs/f2fs/<disk>/dirty_nats_ratio
+Date: January 2016
+Contact: "Chao Yu" <chao2.yu@samsung.com>
+Description:
+ Controls dirty nat entries ratio threshold, if current
+ ratio exceeds configured threshold, checkpoint will
+ be triggered for flushing dirty nat entries.
+
+What: /sys/fs/f2fs/<disk>/lifetime_write_kbytes
+Date: January 2016
+Contact: "Shuoran Liu" <liushuoran@huawei.com>
+Description:
+ Shows total written kbytes issued to disk.
+
+What: /sys/fs/f2fs/<disk>/feature
+Date: July 2017
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+ Shows all enabled features in current device.
+
+What: /sys/fs/f2fs/<disk>/inject_rate
+Date: May 2016
+Contact: "Sheng Yong" <shengyong1@huawei.com>
+Description:
+ Controls the injection rate.
+
+What: /sys/fs/f2fs/<disk>/inject_type
+Date: May 2016
+Contact: "Sheng Yong" <shengyong1@huawei.com>
+Description:
+ Controls the injection type.
+
+What: /sys/fs/f2fs/<disk>/reserved_blocks
+Date: June 2017
+Contact: "Chao Yu" <yuchao0@huawei.com>
+Description:
+ Controls target reserved blocks in system, the threshold
+ is soft, it could exceed current available user space.
+
+What: /sys/fs/f2fs/<disk>/current_reserved_blocks
+Date: October 2017
+Contact: "Yunlong Song" <yunlong.song@huawei.com>
+Contact: "Chao Yu" <yuchao0@huawei.com>
+Description:
+ Shows current reserved blocks in system, it may be temporarily
+ smaller than target_reserved_blocks, but will gradually
+ increase to target_reserved_blocks when more free blocks are
+ freed by user later.
+
+What: /sys/fs/f2fs/<disk>/gc_urgent
+Date: August 2017
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+ Do background GC agressively
+
+What: /sys/fs/f2fs/<disk>/gc_urgent_sleep_time
+Date: August 2017
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+ Controls sleep time of GC urgent mode
+
+What: /sys/fs/f2fs/<disk>/readdir_ra
+Date: November 2017
+Contact: "Sheng Yong" <shengyong1@huawei.com>
+Description:
+ Controls readahead inode block in readdir.
+
+What: /sys/fs/f2fs/<disk>/extension_list
+Date: Feburary 2018
+Contact: "Chao Yu" <yuchao0@huawei.com>
+Description:
+ Used to control configure extension list:
+ - Query: cat /sys/fs/f2fs/<disk>/extension_list
+ - Add: echo '[h/c]extension' > /sys/fs/f2fs/<disk>/extension_list
+ - Del: echo '[h/c]!extension' > /sys/fs/f2fs/<disk>/extension_list
+ - [h] means add/del hot file extension
+ - [c] means add/del cold file extension
diff --git a/Documentation/ABI/testing/sysfs-kernel-wakeup_reasons b/Documentation/ABI/testing/sysfs-kernel-wakeup_reasons
new file mode 100644
index 0000000..acb19b9
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-wakeup_reasons
@@ -0,0 +1,16 @@
+What: /sys/kernel/wakeup_reasons/last_resume_reason
+Date: February 2014
+Contact: Ruchi Kandoi <kandoiruchi@google.com>
+Description:
+ The /sys/kernel/wakeup_reasons/last_resume_reason is
+ used to report wakeup reasons after system exited suspend.
+
+What: /sys/kernel/wakeup_reasons/last_suspend_time
+Date: March 2015
+Contact: jinqian <jinqian@google.com>
+Description:
+ The /sys/kernel/wakeup_reasons/last_suspend_time is
+ used to report time spent in last suspend cycle. It contains
+ two numbers (in seconds) separated by space. First number is
+ the time spent in suspend and resume processes. Second number
+ is the time spent in sleep state.
\ No newline at end of file
diff --git a/Documentation/android.txt b/Documentation/android.txt
new file mode 100644
index 0000000..0f40a78
--- /dev/null
+++ b/Documentation/android.txt
@@ -0,0 +1,121 @@
+ =============
+ A N D R O I D
+ =============
+
+Copyright (C) 2009 Google, Inc.
+Written by Mike Chan <mike@android.com>
+
+CONTENTS:
+---------
+
+1. Android
+ 1.1 Required enabled config options
+ 1.2 Required disabled config options
+ 1.3 Recommended enabled config options
+2. Contact
+
+
+1. Android
+==========
+
+Android (www.android.com) is an open source operating system for mobile devices.
+This document describes configurations needed to run the Android framework on
+top of the Linux kernel.
+
+To see a working defconfig look at msm_defconfig or goldfish_defconfig
+which can be found at http://android.git.kernel.org in kernel/common.git
+and kernel/msm.git
+
+
+1.1 Required enabled config options
+-----------------------------------
+After building a standard defconfig, ensure that these options are enabled in
+your .config or defconfig if they are not already. Based off the msm_defconfig.
+You should keep the rest of the default options enabled in the defconfig
+unless you know what you are doing.
+
+ANDROID_PARANOID_NETWORK
+ASHMEM
+CONFIG_FB_MODE_HELPERS
+CONFIG_FONT_8x16
+CONFIG_FONT_8x8
+CONFIG_YAFFS_SHORT_NAMES_IN_RAM
+DAB
+EARLYSUSPEND
+FB
+FB_CFB_COPYAREA
+FB_CFB_FILLRECT
+FB_CFB_IMAGEBLIT
+FB_DEFERRED_IO
+FB_TILEBLITTING
+HIGH_RES_TIMERS
+INOTIFY
+INOTIFY_USER
+INPUT_EVDEV
+INPUT_GPIO
+INPUT_MISC
+LEDS_CLASS
+LEDS_GPIO
+LOCK_KERNEL
+LkOGGER
+LOW_MEMORY_KILLER
+MISC_DEVICES
+NEW_LEDS
+NO_HZ
+POWER_SUPPLY
+PREEMPT
+RAMFS
+RTC_CLASS
+RTC_LIB
+SWITCH
+SWITCH_GPIO
+TMPFS
+UID_STAT
+UID16
+USB_FUNCTION
+USB_FUNCTION_ADB
+USER_WAKELOCK
+VIDEO_OUTPUT_CONTROL
+WAKELOCK
+YAFFS_AUTO_YAFFS2
+YAFFS_FS
+YAFFS_YAFFS1
+YAFFS_YAFFS2
+
+
+1.2 Required disabled config options
+------------------------------------
+CONFIG_YAFFS_DISABLE_LAZY_LOAD
+DNOTIFY
+
+
+1.3 Recommended enabled config options
+------------------------------
+ANDROID_PMEM
+PSTORE_CONSOLE
+PSTORE_RAM
+SCHEDSTATS
+DEBUG_PREEMPT
+DEBUG_MUTEXES
+DEBUG_SPINLOCK_SLEEP
+DEBUG_INFO
+FRAME_POINTER
+CPU_FREQ
+CPU_FREQ_TABLE
+CPU_FREQ_DEFAULT_GOV_ONDEMAND
+CPU_FREQ_GOV_ONDEMAND
+CRC_CCITT
+EMBEDDED
+INPUT_TOUCHSCREEN
+I2C
+I2C_BOARDINFO
+LOG_BUF_SHIFT=17
+SERIAL_CORE
+SERIAL_CORE_CONSOLE
+
+
+2. Contact
+==========
+website: http://android.git.kernel.org
+
+mailing-lists: android-kernel@googlegroups.com
diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 701d39d..56d6d8b7 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -109,7 +109,13 @@
1 - 4K
2 - 16K
3 - 64K
- Bits 3-63: Reserved.
+ Bit 3: Kernel physical placement
+ 0 - 2MB aligned base should be as close as possible
+ to the base of DRAM, since memory below it is not
+ accessible via the linear mapping
+ 1 - 2MB aligned base may be anywhere in physical
+ memory
+ Bits 4-63: Reserved.
- When image_size is zero, a bootloader should attempt to keep as much
memory as possible free for use by the kernel immediately after the
@@ -117,14 +123,14 @@
depending on selected features, and is effectively unbound.
The Image must be placed text_offset bytes from a 2MB aligned base
-address near the start of usable system RAM and called there. Memory
-below that base address is currently unusable by Linux, and therefore it
-is strongly recommended that this location is the start of system RAM.
-The region between the 2 MB aligned base address and the start of the
-image has no special significance to the kernel, and may be used for
-other purposes.
+address anywhere in usable system RAM and called there. The region
+between the 2 MB aligned base address and the start of the image has no
+special significance to the kernel, and may be used for other purposes.
At least image_size bytes from the start of the image must be free for
use by the kernel.
+NOTE: versions prior to v4.6 cannot make use of memory below the
+physical offset of the Image so it is recommended that the Image be
+placed as close as possible to the start of system RAM.
Any memory described to the kernel (even that below the start of the
image) which is not marked as reserved from the kernel (e.g., with a
diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
new file mode 100644
index 0000000..ba4b6ac
--- /dev/null
+++ b/Documentation/arm64/silicon-errata.txt
@@ -0,0 +1,59 @@
+ Silicon Errata and Software Workarounds
+ =======================================
+
+Author: Will Deacon <will.deacon@arm.com>
+Date : 27 November 2015
+
+It is an unfortunate fact of life that hardware is often produced with
+so-called "errata", which can cause it to deviate from the architecture
+under specific circumstances. For hardware produced by ARM, these
+errata are broadly classified into the following categories:
+
+ Category A: A critical error without a viable workaround.
+ Category B: A significant or critical error with an acceptable
+ workaround.
+ Category C: A minor error that is not expected to occur under normal
+ operation.
+
+For more information, consult one of the "Software Developers Errata
+Notice" documents available on infocenter.arm.com (registration
+required).
+
+As far as Linux is concerned, Category B errata may require some special
+treatment in the operating system. For example, avoiding a particular
+sequence of code, or configuring the processor in a particular way. A
+less common situation may require similar actions in order to declassify
+a Category A erratum into a Category C erratum. These are collectively
+known as "software workarounds" and are only required in the minority of
+cases (e.g. those cases that both require a non-secure workaround *and*
+can be triggered by Linux).
+
+For software workarounds that may adversely impact systems unaffected by
+the erratum in question, a Kconfig entry is added under "Kernel
+Features" -> "ARM errata workarounds via the alternatives framework".
+These are enabled by default and patched in at runtime when an affected
+CPU is detected. For less-intrusive workarounds, a Kconfig option is not
+available and the code is structured (preferably with a comment) in such
+a way that the erratum will not be hit.
+
+This approach can make it slightly onerous to determine exactly which
+errata are worked around in an arbitrary kernel source tree, so this
+file acts as a registry of software workarounds in the Linux Kernel and
+will be updated when new workarounds are committed and backported to
+stable kernels.
+
+| Implementor | Component | Erratum ID | Kconfig |
++----------------+-----------------+-----------------+-------------------------+
+| ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 |
+| ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 |
+| ARM | Cortex-A53 | #824069 | ARM64_ERRATUM_824069 |
+| ARM | Cortex-A53 | #819472 | ARM64_ERRATUM_819472 |
+| ARM | Cortex-A53 | #845719 | ARM64_ERRATUM_845719 |
+| ARM | Cortex-A53 | #843419 | ARM64_ERRATUM_843419 |
+| ARM | Cortex-A57 | #832075 | ARM64_ERRATUM_832075 |
+| ARM | Cortex-A57 | #852523 | N/A |
+| ARM | Cortex-A57 | #834220 | ARM64_ERRATUM_834220 |
+| | | | |
+| Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 |
+| Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 |
+| Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 |
diff --git a/Documentation/block/00-INDEX b/Documentation/block/00-INDEX
index e840b47..bc51487 100644
--- a/Documentation/block/00-INDEX
+++ b/Documentation/block/00-INDEX
@@ -26,3 +26,9 @@
- Switching I/O schedulers at runtime
writeback_cache_control.txt
- Control of volatile write back caches
+mmc-max-speed.txt
+ - eMMC layer speed simulation, related to /sys/block/mmcblk*/
+ attributes:
+ max_read_speed
+ max_write_speed
+ cache_size
diff --git a/Documentation/block/mmc-max-speed.txt b/Documentation/block/mmc-max-speed.txt
new file mode 100644
index 0000000..3f052b9
--- /dev/null
+++ b/Documentation/block/mmc-max-speed.txt
@@ -0,0 +1,38 @@
+eMMC Block layer simulation speed controls in /sys/block/mmcblk*/
+===============================================
+
+Turned on with CONFIG_MMC_SIMULATE_MAX_SPEED which enables MMC device speed
+limiting. Used to test and simulate the behavior of the system when
+confronted with a slow MMC.
+
+Enables max_read_speed, max_write_speed and cache_size attributes and module
+default parameters to control the write or read maximum KB/second speed
+behaviors.
+
+NB: There is room for improving the algorithm for aspects tied directly to
+eMMC specific behavior. For instance, wear leveling and stalls from an
+exhausted erase pool. We would expect that if there was a need to provide
+similar speed simulation controls to other types of block devices, aspects of
+their behavior are modelled separately (e.g. head seek times, heat assist,
+shingling and rotational latency).
+
+/sys/block/mmcblk0/max_read_speed:
+
+Number of KB/second reads allowed to the block device. Used to test and
+simulate the behavior of the system when confronted with a slow reading MMC.
+Set to 0 or "off" to place no speed limit.
+
+/sys/block/mmcblk0/max_write_speed:
+
+Number of KB/second writes allowed to the block device. Used to test and
+simulate the behavior of the system when confronted with a slow writing MMC.
+Set to 0 or "off" to place no speed limit.
+
+/sys/block/mmcblk0/cache_size:
+
+Number of MB of high speed memory or high speed SLC cache expected on the
+eMMC device being simulated. Used to help simulate the write-back behavior
+more accurately. The assumption is the cache has no delay, but draws down
+in the background to the MLC/TLC primary store at the max_write_speed rate.
+Any write speed delays will show up when the cache is full, or when an I/O
+request to flush is issued.
diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index c15aa75..ac8a37e 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -28,6 +28,7 @@
2.3 Userspace
2.4 Ondemand
2.5 Conservative
+2.6 Interactive
3. The Governor Interface in the CPUfreq Core
@@ -218,6 +219,90 @@
speed. Load for frequency increase is still evaluated every
sampling rate.
+2.6 Interactive
+---------------
+
+The CPUfreq governor "interactive" is designed for latency-sensitive,
+interactive workloads. This governor sets the CPU speed depending on
+usage, similar to "ondemand" and "conservative" governors, but with a
+different set of configurable behaviors.
+
+The tuneable values for this governor are:
+
+target_loads: CPU load values used to adjust speed to influence the
+current CPU load toward that value. In general, the lower the target
+load, the more often the governor will raise CPU speeds to bring load
+below the target. The format is a single target load, optionally
+followed by pairs of CPU speeds and CPU loads to target at or above
+those speeds. Colons can be used between the speeds and associated
+target loads for readability. For example:
+
+ 85 1000000:90 1700000:99
+
+targets CPU load 85% below speed 1GHz, 90% at or above 1GHz, until
+1.7GHz and above, at which load 99% is targeted. If speeds are
+specified these must appear in ascending order. Higher target load
+values are typically specified for higher speeds, that is, target load
+values also usually appear in an ascending order. The default is
+target load 90% for all speeds.
+
+min_sample_time: The minimum amount of time to spend at the current
+frequency before ramping down. Default is 80000 uS.
+
+hispeed_freq: An intermediate "hi speed" at which to initially ramp
+when CPU load hits the value specified in go_hispeed_load. If load
+stays high for the amount of time specified in above_hispeed_delay,
+then speed may be bumped higher. Default is the maximum speed
+allowed by the policy at governor initialization time.
+
+go_hispeed_load: The CPU load at which to ramp to hispeed_freq.
+Default is 99%.
+
+above_hispeed_delay: When speed is at or above hispeed_freq, wait for
+this long before raising speed in response to continued high load.
+The format is a single delay value, optionally followed by pairs of
+CPU speeds and the delay to use at or above those speeds. Colons can
+be used between the speeds and associated delays for readability. For
+example:
+
+ 80000 1300000:200000 1500000:40000
+
+uses delay 80000 uS until CPU speed 1.3 GHz, at which speed delay
+200000 uS is used until speed 1.5 GHz, at which speed (and above)
+delay 40000 uS is used. If speeds are specified these must appear in
+ascending order. Default is 20000 uS.
+
+timer_rate: Sample rate for reevaluating CPU load when the CPU is not
+idle. A deferrable timer is used, such that the CPU will not be woken
+from idle to service this timer until something else needs to run.
+(The maximum time to allow deferring this timer when not running at
+minimum speed is configurable via timer_slack.) Default is 20000 uS.
+
+timer_slack: Maximum additional time to defer handling the governor
+sampling timer beyond timer_rate when running at speeds above the
+minimum. For platforms that consume additional power at idle when
+CPUs are running at speeds greater than minimum, this places an upper
+bound on how long the timer will be deferred prior to re-evaluating
+load and dropping speed. For example, if timer_rate is 20000uS and
+timer_slack is 10000uS then timers will be deferred for up to 30msec
+when not at lowest speed. A value of -1 means defer timers
+indefinitely at all speeds. Default is 80000 uS.
+
+boost: If non-zero, immediately boost speed of all CPUs to at least
+hispeed_freq until zero is written to this attribute. If zero, allow
+CPU speeds to drop below hispeed_freq according to load as usual.
+Default is zero.
+
+boostpulse: On each write, immediately boost speed of all CPUs to
+hispeed_freq for at least the period of time specified by
+boostpulse_duration, after which speeds are allowed to drop below
+hispeed_freq according to load as usual.
+
+boostpulse_duration: Length of time to hold CPU speed at hispeed_freq
+on a write to boostpulse, before allowing speed to drop according to
+load as usual. Default is 80000 uS.
+
+
3. The Governor Interface in the CPUfreq Core
=============================================
diff --git a/Documentation/device-mapper/boot.txt b/Documentation/device-mapper/boot.txt
new file mode 100644
index 0000000..adcaad5
--- /dev/null
+++ b/Documentation/device-mapper/boot.txt
@@ -0,0 +1,42 @@
+Boot time creation of mapped devices
+===================================
+
+It is possible to configure a device mapper device to act as the root
+device for your system in two ways.
+
+The first is to build an initial ramdisk which boots to a minimal
+userspace which configures the device, then pivot_root(8) in to it.
+
+For simple device mapper configurations, it is possible to boot directly
+using the following kernel command line:
+
+dm="<name> <uuid> <ro>,table line 1,...,table line n"
+
+name = the name to associate with the device
+ after boot, udev, if used, will use that name to label
+ the device node.
+uuid = may be 'none' or the UUID desired for the device.
+ro = may be "ro" or "rw". If "ro", the device and device table will be
+ marked read-only.
+
+Each table line may be as normal when using the dmsetup tool except for
+two variations:
+1. Any use of commas will be interpreted as a newline
+2. Quotation marks cannot be escaped and cannot be used without
+ terminating the dm= argument.
+
+Unless renamed by udev, the device node created will be dm-0 as the
+first minor number for the device-mapper is used during early creation.
+
+Example
+=======
+
+- Booting to a linear array made up of user-mode linux block devices:
+
+ dm="lroot none 0, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" \
+ root=/dev/dm-0
+
+Will boot to a rw dm-linear target of 8192 sectors split across two
+block devices identified by their major:minor numbers. After boot, udev
+will rename this target to /dev/mapper/lroot (depending on the rules).
+No uuid was assigned.
diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt
index e15bc1a..b3d2e4a 100644
--- a/Documentation/device-mapper/verity.txt
+++ b/Documentation/device-mapper/verity.txt
@@ -18,11 +18,11 @@
0 is the original format used in the Chromium OS.
The salt is appended when hashing, digests are stored continuously and
- the rest of the block is padded with zeros.
+ the rest of the block is padded with zeroes.
1 is the current format that should be used for new devices.
The salt is prepended when hashing and each digest is
- padded with zeros to the power of two.
+ padded with zeroes to the power of two.
<dev>
This is the device containing data, the integrity of which needs to be
@@ -79,6 +79,48 @@
not compatible with ignore_corruption and requires user space support to
avoid restart loops.
+ignore_zero_blocks
+ Do not verify blocks that are expected to contain zeroes and always return
+ zeroes instead. This may be useful if the partition contains unused blocks
+ that are not guaranteed to contain zeroes.
+
+use_fec_from_device <fec_dev>
+ Use forward error correction (FEC) to recover from corruption if hash
+ verification fails. Use encoding data from the specified device. This
+ may be the same device where data and hash blocks reside, in which case
+ fec_start must be outside data and hash areas.
+
+ If the encoding data covers additional metadata, it must be accessible
+ on the hash device after the hash blocks.
+
+ Note: block sizes for data and hash devices must match. Also, if the
+ verity <dev> is encrypted the <fec_dev> should be too.
+
+fec_roots <num>
+ Number of generator roots. This equals to the number of parity bytes in
+ the encoding data. For example, in RS(M, N) encoding, the number of roots
+ is M-N.
+
+fec_blocks <num>
+ The number of encoding data blocks on the FEC device. The block size for
+ the FEC device is <data_block_size>.
+
+fec_start <offset>
+ This is the offset, in <data_block_size> blocks, from the start of the
+ FEC device to the beginning of the encoding data.
+
+check_at_most_once
+ Verify data blocks only the first time they are read from the data device,
+ rather than every time. This reduces the overhead of dm-verity so that it
+ can be used on systems that are memory and/or CPU constrained. However, it
+ provides a reduced level of security because only offline tampering of the
+ data device's content will be detected, not online tampering.
+
+ Hash blocks are still verified each time they are read from the hash device,
+ since verification of hash blocks is less performance critical than data
+ blocks, and a hash block will not be verified any more after all the data
+ blocks it covers have been verified anyway.
+
Theory of operation
===================
@@ -98,6 +140,11 @@
into the page cache. Block hashes are stored linearly, aligned to the nearest
block size.
+If forward error correction (FEC) support is enabled any recovery of
+corrupted data will be verified using the cryptographic hash of the
+corresponding data. This is why combining error correction with
+integrity checking is essential.
+
Hash Tree
---------
diff --git a/Documentation/devicetree/bindings/arm/firmware/linaro,optee-tz.txt b/Documentation/devicetree/bindings/arm/firmware/linaro,optee-tz.txt
new file mode 100644
index 0000000..d38834c
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/firmware/linaro,optee-tz.txt
@@ -0,0 +1,31 @@
+OP-TEE Device Tree Bindings
+
+OP-TEE is a piece of software using hardware features to provide a Trusted
+Execution Environment. The security can be provided with ARM TrustZone, but
+also by virtualization or a separate chip.
+
+We're using "linaro" as the first part of the compatible property for
+the reference implementation maintained by Linaro.
+
+* OP-TEE based on ARM TrustZone required properties:
+
+- compatible : should contain "linaro,optee-tz"
+
+- method : The method of calling the OP-TEE Trusted OS. Permitted
+ values are:
+
+ "smc" : SMC #0, with the register assignments specified
+ in drivers/tee/optee/optee_smc.h
+
+ "hvc" : HVC #0, with the register assignments specified
+ in drivers/tee/optee/optee_smc.h
+
+
+
+Example:
+ firmware {
+ optee {
+ compatible = "linaro,optee-tz";
+ method = "smc";
+ };
+ };
diff --git a/Documentation/devicetree/bindings/display/google,goldfish-fb.txt b/Documentation/devicetree/bindings/display/google,goldfish-fb.txt
new file mode 100644
index 0000000..751fa9f
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/google,goldfish-fb.txt
@@ -0,0 +1,17 @@
+Android Goldfish framebuffer
+
+Android Goldfish framebuffer device used by Android emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-fb"
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example:
+
+ display-controller@1f008000 {
+ compatible = "google,goldfish-fb";
+ interrupts = <0x10>;
+ reg = <0x1f008000 0x100>;
+ };
diff --git a/Documentation/devicetree/bindings/goldfish/audio.txt b/Documentation/devicetree/bindings/goldfish/audio.txt
new file mode 100644
index 0000000..d043fda
--- /dev/null
+++ b/Documentation/devicetree/bindings/goldfish/audio.txt
@@ -0,0 +1,17 @@
+Android Goldfish Audio
+
+Android goldfish audio device generated by android emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-audio" to match emulator
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example:
+
+ goldfish_audio@9030000 {
+ compatible = "google,goldfish-audio";
+ reg = <0x9030000 0x100>;
+ interrupts = <0x4>;
+ };
diff --git a/Documentation/devicetree/bindings/goldfish/battery.txt b/Documentation/devicetree/bindings/goldfish/battery.txt
new file mode 100644
index 0000000..4fb6139
--- /dev/null
+++ b/Documentation/devicetree/bindings/goldfish/battery.txt
@@ -0,0 +1,17 @@
+Android Goldfish Battery
+
+Android goldfish battery device generated by android emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-battery" to match emulator
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example:
+
+ goldfish_battery@9020000 {
+ compatible = "google,goldfish-battery";
+ reg = <0x9020000 0x1000>;
+ interrupts = <0x3>;
+ };
diff --git a/Documentation/devicetree/bindings/goldfish/events.txt b/Documentation/devicetree/bindings/goldfish/events.txt
new file mode 100644
index 0000000..5babf46
--- /dev/null
+++ b/Documentation/devicetree/bindings/goldfish/events.txt
@@ -0,0 +1,17 @@
+Android Goldfish Events Keypad
+
+Android goldfish events keypad device generated by android emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-events-keypad" to match emulator
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example:
+
+ goldfish-events@9040000 {
+ compatible = "google,goldfish-events-keypad";
+ reg = <0x9040000 0x1000>;
+ interrupts = <0x5>;
+ };
diff --git a/Documentation/devicetree/bindings/goldfish/tty.txt b/Documentation/devicetree/bindings/goldfish/tty.txt
new file mode 100644
index 0000000..8264827
--- /dev/null
+++ b/Documentation/devicetree/bindings/goldfish/tty.txt
@@ -0,0 +1,17 @@
+Android Goldfish TTY
+
+Android goldfish tty device generated by android emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-tty" to match emulator
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example:
+
+ goldfish_tty@1f004000 {
+ compatible = "google,goldfish-tty";
+ reg = <0x1f004000 0x1000>;
+ interrupts = <0xc>;
+ };
diff --git a/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt b/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
new file mode 100644
index 0000000..35f7527
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
@@ -0,0 +1,30 @@
+Android Goldfish PIC
+
+Android Goldfish programmable interrupt device used by Android
+emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-pic"
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example for mips when used in cascade mode:
+
+ cpuintc {
+ #interrupt-cells = <0x1>;
+ #address-cells = <0>;
+ interrupt-controller;
+ compatible = "mti,cpu-interrupt-controller";
+ };
+
+ interrupt-controller@1f000000 {
+ compatible = "google,goldfish-pic";
+ reg = <0x1f000000 0x1000>;
+
+ interrupt-controller;
+ #interrupt-cells = <0x1>;
+
+ interrupt-parent = <&cpuintc>;
+ interrupts = <0x2>;
+ };
diff --git a/Documentation/devicetree/bindings/misc/memory-state-time.txt b/Documentation/devicetree/bindings/misc/memory-state-time.txt
new file mode 100644
index 0000000..c99a506
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/memory-state-time.txt
@@ -0,0 +1,8 @@
+Memory bandwidth and frequency state tracking
+
+Required properties:
+- compatible : should be:
+ "memory-state-time"
+- freq-tbl: Should contain entries with each frequency in Hz.
+- bw-buckets: Should contain upper-bound limits for each bandwidth bucket in Mbps.
+ Must match the framework power_profile.xml for the device.
diff --git a/Documentation/devicetree/bindings/misc/ramoops.txt b/Documentation/devicetree/bindings/misc/ramoops.txt
new file mode 100644
index 0000000..5a475fa
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/ramoops.txt
@@ -0,0 +1,43 @@
+Ramoops oops/panic logger
+=========================
+
+ramoops provides persistent RAM storage for oops and panics, so they can be
+recovered after a reboot.
+
+Parts of this storage may be set aside for other persistent log buffers, such
+as kernel log messages, or for optional ECC error-correction data. The total
+size of these optional buffers must fit in the reserved region.
+
+Any remaining space will be used for a circular buffer of oops and panic
+records. These records have a configurable size, with a size of 0 indicating
+that they should be disabled.
+
+
+Required properties:
+
+- compatible: must be "ramoops"
+
+- memory-region: phandle to a region of memory that is preserved between reboots
+
+
+Optional properties:
+
+- ecc-size: enables ECC support and specifies ECC buffer size in bytes
+ (defaults to no ECC)
+
+- record-size: maximum size in bytes of each dump done on oops/panic
+ (defaults to 0)
+
+- console-size: size in bytes of log buffer reserved for kernel messages
+ (defaults to 0)
+
+- ftrace-size: size in bytes of log buffer reserved for function tracing and
+ profiling (defaults to 0)
+
+- pmsg-size: size in bytes of log buffer reserved for userspace messages
+ (defaults to 0)
+
+- unbuffered: if present, use unbuffered mappings to map the reserved region
+ (defaults to buffered mappings)
+
+- no-dump-oops: if present, only dump panics (defaults to panics and oops)
diff --git a/Documentation/devicetree/bindings/power/mti,mips-cpc.txt b/Documentation/devicetree/bindings/power/mti,mips-cpc.txt
new file mode 100644
index 0000000..c6b8251
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/mti,mips-cpc.txt
@@ -0,0 +1,8 @@
+Binding for MIPS Cluster Power Controller (CPC).
+
+This binding allows a system to specify where the CPC registers are
+located.
+
+Required properties:
+compatible : Should be "mti,mips-cpc".
+regs: Should describe the address & size of the CPC register region.
diff --git a/Documentation/devicetree/bindings/rtc/google,goldfish-rtc.txt b/Documentation/devicetree/bindings/rtc/google,goldfish-rtc.txt
new file mode 100644
index 0000000..634312d
--- /dev/null
+++ b/Documentation/devicetree/bindings/rtc/google,goldfish-rtc.txt
@@ -0,0 +1,17 @@
+Android Goldfish RTC
+
+Android Goldfish RTC device used by Android emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-rtc"
+- reg : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example:
+
+ goldfish_timer@9020000 {
+ compatible = "google,goldfish-rtc";
+ reg = <0x9020000 0x1000>;
+ interrupts = <0x3>;
+ };
diff --git a/Documentation/devicetree/bindings/scheduler/sched-energy-costs.txt b/Documentation/devicetree/bindings/scheduler/sched-energy-costs.txt
new file mode 100644
index 0000000..11216f0
--- /dev/null
+++ b/Documentation/devicetree/bindings/scheduler/sched-energy-costs.txt
@@ -0,0 +1,360 @@
+===========================================================
+Energy cost bindings for Energy Aware Scheduling
+===========================================================
+
+===========================================================
+1 - Introduction
+===========================================================
+
+This note specifies bindings required for energy-aware scheduling
+(EAS)[1]. Historically, the scheduler's primary objective has been
+performance. EAS aims to provide an alternative objective - energy
+efficiency. EAS relies on a simple platform energy cost model to
+guide scheduling decisions. The model only considers the CPU
+subsystem.
+
+This note is aligned with the definition of the layout of physical
+CPUs in the system as described in the ARM topology binding
+description [2]. The concept is applicable to any system so long as
+the cost model data is provided for those processing elements in
+that system's topology that EAS is required to service.
+
+Processing elements refer to hardware threads, CPUs and clusters of
+related CPUs in increasing order of hierarchy.
+
+EAS requires two key cost metrics - busy costs and idle costs. Busy
+costs comprise of a list of compute capacities for the processing
+element in question and the corresponding power consumption at that
+capacity. Idle costs comprise of a list of power consumption values
+for each idle state [C-state] that the processing element supports.
+For a detailed description of these metrics, their derivation and
+their use see [3].
+
+These cost metrics are required for processing elements in all
+scheduling domain levels that EAS is required to service.
+
+===========================================================
+2 - energy-costs node
+===========================================================
+
+Energy costs for the processing elements in scheduling domains that
+EAS is required to service are defined in the energy-costs node
+which acts as a container for the actual per processing element cost
+nodes. A single energy-costs node is required for a given system.
+
+- energy-costs node
+
+ Usage: Required
+
+ Description: The energy-costs node is a container node and
+ it's sub-nodes describe costs for each processing element at
+ all scheduling domain levels that EAS is required to
+ service.
+
+ Node name must be "energy-costs".
+
+ The energy-costs node's parent node must be the cpus node.
+
+ The energy-costs node's child nodes can be:
+
+ - one or more cost nodes.
+
+ Any other configuration is considered invalid.
+
+The energy-costs node can only contain a single type of child node
+whose bindings are described in paragraph 4.
+
+===========================================================
+3 - energy-costs node child nodes naming convention
+===========================================================
+
+energy-costs child nodes must follow a naming convention where the
+node name must be "thread-costN", "core-costN", "cluster-costN"
+depending on whether the costs in the node are for a thread, core or
+cluster. N (where N = {0, 1, ...}) is the node number and has no
+bearing to the OS' logical thread, core or cluster index.
+
+===========================================================
+4 - cost node bindings
+===========================================================
+
+Bindings for cost nodes are defined as follows:
+
+- cluster-cost node
+
+ Description: must be declared within an energy-costs node. A
+ system can contain multiple clusters and each cluster
+ serviced by EAS must have a corresponding cluster-costs
+ node.
+
+ The cluster-cost node name must be "cluster-costN" as
+ described in 3 above.
+
+ A cluster-cost node must be a leaf node with no children.
+
+ Properties for cluster-cost nodes are described in paragraph
+ 5 below.
+
+ Any other configuration is considered invalid.
+
+- core-cost node
+
+ Description: must be declared within an energy-costs node. A
+ system can contain multiple cores and each core serviced by
+ EAS must have a corresponding core-cost node.
+
+ The core-cost node name must be "core-costN" as described in
+ 3 above.
+
+ A core-cost node must be a leaf node with no children.
+
+ Properties for core-cost nodes are described in paragraph
+ 5 below.
+
+ Any other configuration is considered invalid.
+
+- thread-cost node
+
+ Description: must be declared within an energy-costs node. A
+ system can contain cores with multiple hardware threads and
+ each thread serviced by EAS must have a corresponding
+ thread-cost node.
+
+ The core-cost node name must be "core-costN" as described in
+ 3 above.
+
+ A core-cost node must be a leaf node with no children.
+
+ Properties for thread-cost nodes are described in paragraph
+ 5 below.
+
+ Any other configuration is considered invalid.
+
+===========================================================
+5 - Cost node properties
+==========================================================
+
+All cost node types must have only the following properties:
+
+- busy-cost-data
+
+ Usage: required
+ Value type: An array of 2-item tuples. Each item is of type
+ u32.
+ Definition: The first item in the tuple is the capacity
+ value as described in [3]. The second item in the tuple is
+ the energy cost value as described in [3].
+
+- idle-cost-data
+
+ Usage: required
+ Value type: An array of 1-item tuples. The item is of type
+ u32.
+ Definition: The item in the tuple is the energy cost value
+ as described in [3].
+
+===========================================================
+4 - Extensions to the cpu node
+===========================================================
+
+The cpu node is extended with a property that establishes the
+connection between the processing element represented by the cpu
+node and the cost-nodes associated with this processing element.
+
+The connection is expressed in line with the topological hierarchy
+that this processing element belongs to starting with the level in
+the hierarchy that this processing element itself belongs to through
+to the highest level that EAS is required to service. The
+connection cannot be sparse and must be contiguous from the
+processing element's level through to the highest desired level. The
+highest desired level must be the same for all processing elements.
+
+Example: Given that a cpu node may represent a thread that is a part
+of a core, this property may contain multiple elements which
+associate the thread with cost nodes describing the costs for the
+thread itself, the core the thread belongs to, the cluster the core
+belongs to and so on. The elements must be ordered from the lowest
+level nodes to the highest desired level that EAS must service. The
+highest desired level must be the same for all cpu nodes. The
+elements must not be sparse: there must be elements for the current
+thread, the next level of hierarchy (core) and so on without any
+'holes'.
+
+Example: Given that a cpu node may represent a core that is a part
+of a cluster of related cpus this property may contain multiple
+elements which associate the core with cost nodes describing the
+costs for the core itself, the cluster the core belongs to and so
+on. The elements must be ordered from the lowest level nodes to the
+highest desired level that EAS must service. The highest desired
+level must be the same for all cpu nodes. The elements must not be
+sparse: there must be elements for the current thread, the next
+level of hierarchy (core) and so on without any 'holes'.
+
+If the system comprises of hierarchical clusters of clusters, this
+property will contain multiple associations with the relevant number
+of cluster elements in hierarchical order.
+
+Property added to the cpu node:
+
+- sched-energy-costs
+
+ Usage: required
+ Value type: List of phandles
+ Definition: a list of phandles to specific cost nodes in the
+ energy-costs parent node that correspond to the processing
+ element represented by this cpu node in hierarchical order
+ of topology.
+
+ The order of phandles in the list is significant. The first
+ phandle is to the current processing element's own cost
+ node. Subsequent phandles are to higher hierarchical level
+ cost nodes up until the maximum level that EAS is to
+ service.
+
+ All cpu nodes must have the same highest level cost node.
+
+ The phandle list must not be sparsely populated with handles
+ to non-contiguous hierarchical levels. See commentary above
+ for clarity.
+
+ Any other configuration is invalid.
+
+===========================================================
+5 - Example dts
+===========================================================
+
+Example 1 (ARM 64-bit, 6-cpu system, two clusters of cpus, one
+cluster of 2 Cortex-A57 cpus, one cluster of 4 Cortex-A53 cpus):
+
+cpus {
+ #address-cells = <2>;
+ #size-cells = <0>;
+ .
+ .
+ .
+ A57_0: cpu@0 {
+ compatible = "arm,cortex-a57","arm,armv8";
+ reg = <0x0 0x0>;
+ device_type = "cpu";
+ enable-method = "psci";
+ next-level-cache = <&A57_L2>;
+ clocks = <&scpi_dvfs 0>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_0 &CLUSTER_COST_0>;
+ };
+
+ A57_1: cpu@1 {
+ compatible = "arm,cortex-a57","arm,armv8";
+ reg = <0x0 0x1>;
+ device_type = "cpu";
+ enable-method = "psci";
+ next-level-cache = <&A57_L2>;
+ clocks = <&scpi_dvfs 0>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_0 &CLUSTER_COST_0>;
+ };
+
+ A53_0: cpu@100 {
+ compatible = "arm,cortex-a53","arm,armv8";
+ reg = <0x0 0x100>;
+ device_type = "cpu";
+ enable-method = "psci";
+ next-level-cache = <&A53_L2>;
+ clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_1 &CLUSTER_COST_1>;
+ };
+
+ A53_1: cpu@101 {
+ compatible = "arm,cortex-a53","arm,armv8";
+ reg = <0x0 0x101>;
+ device_type = "cpu";
+ enable-method = "psci";
+ next-level-cache = <&A53_L2>;
+ clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_1 &CLUSTER_COST_1>;
+ };
+
+ A53_2: cpu@102 {
+ compatible = "arm,cortex-a53","arm,armv8";
+ reg = <0x0 0x102>;
+ device_type = "cpu";
+ enable-method = "psci";
+ next-level-cache = <&A53_L2>;
+ clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_1 &CLUSTER_COST_1>;
+ };
+
+ A53_3: cpu@103 {
+ compatible = "arm,cortex-a53","arm,armv8";
+ reg = <0x0 0x103>;
+ device_type = "cpu";
+ enable-method = "psci";
+ next-level-cache = <&A53_L2>;
+ clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_1 &CLUSTER_COST_1>;
+ };
+
+ energy-costs {
+ CPU_COST_0: core-cost0 {
+ busy-cost-data = <
+ 417 168
+ 579 251
+ 744 359
+ 883 479
+ 1024 616
+ >;
+ idle-cost-data = <
+ 15
+ 0
+ >;
+ };
+ CPU_COST_1: core-cost1 {
+ busy-cost-data = <
+ 235 33
+ 302 46
+ 368 61
+ 406 76
+ 447 93
+ >;
+ idle-cost-data = <
+ 6
+ 0
+ >;
+ };
+ CLUSTER_COST_0: cluster-cost0 {
+ busy-cost-data = <
+ 417 24
+ 579 32
+ 744 43
+ 883 49
+ 1024 64
+ >;
+ idle-cost-data = <
+ 65
+ 24
+ >;
+ };
+ CLUSTER_COST_1: cluster-cost1 {
+ busy-cost-data = <
+ 235 26
+ 303 30
+ 368 39
+ 406 47
+ 447 57
+ >;
+ idle-cost-data = <
+ 56
+ 17
+ >;
+ };
+ };
+};
+
+===============================================================================
+[1] https://lkml.org/lkml/2015/5/12/728
+[2] Documentation/devicetree/bindings/topology.txt
+[3] Documentation/scheduler/sched-energy.txt
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 98dc175..7004fa9 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -128,6 +128,7 @@
lantiq Lantiq Semiconductor
lenovo Lenovo Group Ltd.
lg LG Corporation
+linaro Linaro Limited
linux Linux-specific binding
lsi LSI Corp. (LSI Logic)
lltc Linear Technology Corporation
diff --git a/Documentation/features/time/irq-time-acct/arch-support.txt b/Documentation/features/time/irq-time-acct/arch-support.txt
index e633162..4199ffec 100644
--- a/Documentation/features/time/irq-time-acct/arch-support.txt
+++ b/Documentation/features/time/irq-time-acct/arch-support.txt
@@ -9,7 +9,7 @@
| alpha: | .. |
| arc: | TODO |
| arm: | ok |
- | arm64: | .. |
+ | arm64: | ok |
| avr32: | TODO |
| blackfin: | TODO |
| c6x: | TODO |
diff --git a/Documentation/features/vm/huge-vmap/arch-support.txt b/Documentation/features/vm/huge-vmap/arch-support.txt
index af6816bcc..df1d1f3 100644
--- a/Documentation/features/vm/huge-vmap/arch-support.txt
+++ b/Documentation/features/vm/huge-vmap/arch-support.txt
@@ -9,7 +9,7 @@
| alpha: | TODO |
| arc: | TODO |
| arm: | TODO |
- | arm64: | TODO |
+ | arm64: | ok |
| avr32: | TODO |
| blackfin: | TODO |
| c6x: | TODO |
diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt
index b102b43..ecccb51c 100644
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -102,14 +102,16 @@
collection, triggered in background when I/O subsystem is
idle. If background_gc=on, it will turn on the garbage
collection and if background_gc=off, garbage collection
- will be truned off. If background_gc=sync, it will turn
+ will be turned off. If background_gc=sync, it will turn
on synchronous garbage collection running in background.
Default value for this option is on. So garbage
collection is on by default.
disable_roll_forward Disable the roll-forward recovery routine
norecovery Disable the roll-forward recovery routine, mounted read-
only (i.e., -o ro,disable_roll_forward)
-discard Issue discard/TRIM commands when a segment is cleaned.
+discard/nodiscard Enable/disable real-time discard in f2fs, if discard is
+ enabled, f2fs will issue discard/TRIM commands when a
+ segment is cleaned.
no_heap Disable heap-style segment allocation which finds free
segments for data from the beginning of main area, while
for node from the end of main area.
@@ -123,12 +125,14 @@
disable_ext_identify Disable the extension list configured by mkfs, so f2fs
does not aware of cold files such as media files.
inline_xattr Enable the inline xattrs feature.
+noinline_xattr Disable the inline xattrs feature.
inline_data Enable the inline data feature: New created small(<~3.4k)
files can be written into inode block.
inline_dentry Enable the inline dir feature: data in new created
directory entries can be written into inode block. The
space of inode block which is used to store inline
dentries is limited to ~3.4k.
+noinline_dentry Diable the inline dentry feature.
flush_merge Merge concurrent cache_flush commands as much as possible
to eliminate redundant command issues. If the underlying
device handles the cache_flush command relatively slowly,
@@ -145,10 +149,48 @@
as many as extent which map between contiguous logical
address and physical address per inode, resulting in
increasing the cache hit ratio. Set by default.
-noextent_cache Diable an extent cache based on rb-tree explicitly, see
+noextent_cache Disable an extent cache based on rb-tree explicitly, see
the above extent_cache mount option.
noinline_data Disable the inline data feature, inline data feature is
enabled by default.
+data_flush Enable data flushing before checkpoint in order to
+ persist data of regular and symlink.
+mode=%s Control block allocation mode which supports "adaptive"
+ and "lfs". In "lfs" mode, there should be no random
+ writes towards main area.
+io_bits=%u Set the bit size of write IO requests. It should be set
+ with "mode=lfs".
+usrquota Enable plain user disk quota accounting.
+grpquota Enable plain group disk quota accounting.
+prjquota Enable plain project quota accounting.
+usrjquota=<file> Appoint specified file and type during mount, so that quota
+grpjquota=<file> information can be properly updated during recovery flow,
+prjjquota=<file> <quota file>: must be in root directory;
+jqfmt=<quota type> <quota type>: [vfsold,vfsv0,vfsv1].
+offusrjquota Turn off user journelled quota.
+offgrpjquota Turn off group journelled quota.
+offprjjquota Turn off project journelled quota.
+quota Enable plain user disk quota accounting.
+noquota Disable all plain disk quota option.
+whint_mode=%s Control which write hints are passed down to block
+ layer. This supports "off", "user-based", and
+ "fs-based". In "off" mode (default), f2fs does not pass
+ down hints. In "user-based" mode, f2fs tries to pass
+ down hints given by users. And in "fs-based" mode, f2fs
+ passes down hints with its policy.
+alloc_mode=%s Adjust block allocation policy, which supports "reuse"
+ and "default".
+fsync_mode=%s Control the policy of fsync. Currently supports "posix",
+ "strict", and "nobarrier". In "posix" mode, which is
+ default, fsync will follow POSIX semantics and does a
+ light operation to improve the filesystem performance.
+ In "strict" mode, fsync will be heavy and behaves in line
+ with xfs, ext4 and btrfs, where xfstest generic/342 will
+ pass, but the performance will regress. "nobarrier" is
+ based on "posix", but doesn't issue flush command for
+ non-atomic files likewise "nobarrier" mount option.
+test_dummy_encryption Enable dummy encryption, which provides a fake fscrypt
+ context. The fake fscrypt context is used by xfstests.
================================================================================
DEBUGFS ENTRIES
@@ -192,7 +234,16 @@
policy for garbage collection. Setting gc_idle = 0
(default) will disable this option. Setting
gc_idle = 1 will select the Cost Benefit approach
- & setting gc_idle = 2 will select the greedy aproach.
+ & setting gc_idle = 2 will select the greedy approach.
+
+ gc_urgent This parameter controls triggering background GCs
+ urgently or not. Setting gc_urgent = 0 [default]
+ makes back to default behavior, while if it is set
+ to 1, background thread starts to do GC by given
+ gc_urgent_sleep_time interval.
+
+ gc_urgent_sleep_time This parameter controls sleep time for gc_urgent.
+ 500 ms is set by default. See above gc_urgent.
reclaim_segments This parameter controls the number of prefree
segments to be reclaimed. If the number of prefree
@@ -298,7 +349,7 @@
file. Each file is dump_ssa and dump_sit.
The dump.f2fs is used to debug on-disk data structures of the f2fs filesystem.
-It shows on-disk inode information reconized by a given inode number, and is
+It shows on-disk inode information recognized by a given inode number, and is
able to dump all the SSA and SIT entries into predefined files, ./dump_ssa and
./dump_sit respectively.
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
new file mode 100644
index 0000000..48b424d
--- /dev/null
+++ b/Documentation/filesystems/fscrypt.rst
@@ -0,0 +1,626 @@
+=====================================
+Filesystem-level encryption (fscrypt)
+=====================================
+
+Introduction
+============
+
+fscrypt is a library which filesystems can hook into to support
+transparent encryption of files and directories.
+
+Note: "fscrypt" in this document refers to the kernel-level portion,
+implemented in ``fs/crypto/``, as opposed to the userspace tool
+`fscrypt <https://github.com/google/fscrypt>`_. This document only
+covers the kernel-level portion. For command-line examples of how to
+use encryption, see the documentation for the userspace tool `fscrypt
+<https://github.com/google/fscrypt>`_. Also, it is recommended to use
+the fscrypt userspace tool, or other existing userspace tools such as
+`fscryptctl <https://github.com/google/fscryptctl>`_ or `Android's key
+management system
+<https://source.android.com/security/encryption/file-based>`_, over
+using the kernel's API directly. Using existing tools reduces the
+chance of introducing your own security bugs. (Nevertheless, for
+completeness this documentation covers the kernel's API anyway.)
+
+Unlike dm-crypt, fscrypt operates at the filesystem level rather than
+at the block device level. This allows it to encrypt different files
+with different keys and to have unencrypted files on the same
+filesystem. This is useful for multi-user systems where each user's
+data-at-rest needs to be cryptographically isolated from the others.
+However, except for filenames, fscrypt does not encrypt filesystem
+metadata.
+
+Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
+directly into supported filesystems --- currently ext4, F2FS, and
+UBIFS. This allows encrypted files to be read and written without
+caching both the decrypted and encrypted pages in the pagecache,
+thereby nearly halving the memory used and bringing it in line with
+unencrypted files. Similarly, half as many dentries and inodes are
+needed. eCryptfs also limits encrypted filenames to 143 bytes,
+causing application compatibility issues; fscrypt allows the full 255
+bytes (NAME_MAX). Finally, unlike eCryptfs, the fscrypt API can be
+used by unprivileged users, with no need to mount anything.
+
+fscrypt does not support encrypting files in-place. Instead, it
+supports marking an empty directory as encrypted. Then, after
+userspace provides the key, all regular files, directories, and
+symbolic links created in that directory tree are transparently
+encrypted.
+
+Threat model
+============
+
+Offline attacks
+---------------
+
+Provided that userspace chooses a strong encryption key, fscrypt
+protects the confidentiality of file contents and filenames in the
+event of a single point-in-time permanent offline compromise of the
+block device content. fscrypt does not protect the confidentiality of
+non-filename metadata, e.g. file sizes, file permissions, file
+timestamps, and extended attributes. Also, the existence and location
+of holes (unallocated blocks which logically contain all zeroes) in
+files is not protected.
+
+fscrypt is not guaranteed to protect confidentiality or authenticity
+if an attacker is able to manipulate the filesystem offline prior to
+an authorized user later accessing the filesystem.
+
+Online attacks
+--------------
+
+fscrypt (and storage encryption in general) can only provide limited
+protection, if any at all, against online attacks. In detail:
+
+fscrypt is only resistant to side-channel attacks, such as timing or
+electromagnetic attacks, to the extent that the underlying Linux
+Cryptographic API algorithms are. If a vulnerable algorithm is used,
+such as a table-based implementation of AES, it may be possible for an
+attacker to mount a side channel attack against the online system.
+Side channel attacks may also be mounted against applications
+consuming decrypted data.
+
+After an encryption key has been provided, fscrypt is not designed to
+hide the plaintext file contents or filenames from other users on the
+same system, regardless of the visibility of the keyring key.
+Instead, existing access control mechanisms such as file mode bits,
+POSIX ACLs, LSMs, or mount namespaces should be used for this purpose.
+Also note that as long as the encryption keys are *anywhere* in
+memory, an online attacker can necessarily compromise them by mounting
+a physical attack or by exploiting any kernel security vulnerability
+which provides an arbitrary memory read primitive.
+
+While it is ostensibly possible to "evict" keys from the system,
+recently accessed encrypted files will remain accessible at least
+until the filesystem is unmounted or the VFS caches are dropped, e.g.
+using ``echo 2 > /proc/sys/vm/drop_caches``. Even after that, if the
+RAM is compromised before being powered off, it will likely still be
+possible to recover portions of the plaintext file contents, if not
+some of the encryption keys as well. (Since Linux v4.12, all
+in-kernel keys related to fscrypt are sanitized before being freed.
+However, userspace would need to do its part as well.)
+
+Currently, fscrypt does not prevent a user from maliciously providing
+an incorrect key for another user's existing encrypted files. A
+protection against this is planned.
+
+Key hierarchy
+=============
+
+Master Keys
+-----------
+
+Each encrypted directory tree is protected by a *master key*. Master
+keys can be up to 64 bytes long, and must be at least as long as the
+greater of the key length needed by the contents and filenames
+encryption modes being used. For example, if AES-256-XTS is used for
+contents encryption, the master key must be 64 bytes (512 bits). Note
+that the XTS mode is defined to require a key twice as long as that
+required by the underlying block cipher.
+
+To "unlock" an encrypted directory tree, userspace must provide the
+appropriate master key. There can be any number of master keys, each
+of which protects any number of directory trees on any number of
+filesystems.
+
+Userspace should generate master keys either using a cryptographically
+secure random number generator, or by using a KDF (Key Derivation
+Function). Note that whenever a KDF is used to "stretch" a
+lower-entropy secret such as a passphrase, it is critical that a KDF
+designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.
+
+Per-file keys
+-------------
+
+Master keys are not used to encrypt file contents or names directly.
+Instead, a unique key is derived for each encrypted file, including
+each regular file, directory, and symbolic link. This has several
+advantages:
+
+- In cryptosystems, the same key material should never be used for
+ different purposes. Using the master key as both an XTS key for
+ contents encryption and as a CTS-CBC key for filenames encryption
+ would violate this rule.
+- Per-file keys simplify the choice of IVs (Initialization Vectors)
+ for contents encryption. Without per-file keys, to ensure IV
+ uniqueness both the inode and logical block number would need to be
+ encoded in the IVs. This would make it impossible to renumber
+ inodes, which e.g. ``resize2fs`` can do when resizing an ext4
+ filesystem. With per-file keys, it is sufficient to encode just the
+ logical block number in the IVs.
+- Per-file keys strengthen the encryption of filenames, where IVs are
+ reused out of necessity. With a unique key per directory, IV reuse
+ is limited to within a single directory.
+- Per-file keys allow individual files to be securely erased simply by
+ securely erasing their keys. (Not yet implemented.)
+
+A KDF (Key Derivation Function) is used to derive per-file keys from
+the master key. This is done instead of wrapping a randomly-generated
+key for each file because it reduces the size of the encryption xattr,
+which for some filesystems makes the xattr more likely to fit in-line
+in the filesystem's inode table. With a KDF, only a 16-byte nonce is
+required --- long enough to make key reuse extremely unlikely. A
+wrapped key, on the other hand, would need to be up to 64 bytes ---
+the length of an AES-256-XTS key. Furthermore, currently there is no
+requirement to support unlocking a file with multiple alternative
+master keys or to support rotating master keys. Instead, the master
+keys may be wrapped in userspace, e.g. as done by the `fscrypt
+<https://github.com/google/fscrypt>`_ tool.
+
+The current KDF encrypts the master key using the 16-byte nonce as an
+AES-128-ECB key. The output is used as the derived key. If the
+output is longer than needed, then it is truncated to the needed
+length. Truncation is the norm for directories and symlinks, since
+those use the CTS-CBC encryption mode which requires a key half as
+long as that required by the XTS encryption mode.
+
+Note: this KDF meets the primary security requirement, which is to
+produce unique derived keys that preserve the entropy of the master
+key, assuming that the master key is already a good pseudorandom key.
+However, it is nonstandard and has some problems such as being
+reversible, so it is generally considered to be a mistake! It may be
+replaced with HKDF or another more standard KDF in the future.
+
+Encryption modes and usage
+==========================
+
+fscrypt allows one encryption mode to be specified for file contents
+and one encryption mode to be specified for filenames. Different
+directory trees are permitted to use different encryption modes.
+Currently, the following pairs of encryption modes are supported:
+
+- AES-256-XTS for contents and AES-256-CTS-CBC for filenames
+- AES-128-CBC for contents and AES-128-CTS-CBC for filenames
+- Speck128/256-XTS for contents and Speck128/256-CTS-CBC for filenames
+
+It is strongly recommended to use AES-256-XTS for contents encryption.
+AES-128-CBC was added only for low-powered embedded devices with
+crypto accelerators such as CAAM or CESA that do not support XTS.
+
+Similarly, Speck128/256 support was only added for older or low-end
+CPUs which cannot do AES fast enough -- especially ARM CPUs which have
+NEON instructions but not the Cryptography Extensions -- and for which
+it would not otherwise be feasible to use encryption at all. It is
+not recommended to use Speck on CPUs that have AES instructions.
+Speck support is only available if it has been enabled in the crypto
+API via CONFIG_CRYPTO_SPECK. Also, on ARM platforms, to get
+acceptable performance CONFIG_CRYPTO_SPECK_NEON must be enabled.
+
+New encryption modes can be added relatively easily, without changes
+to individual filesystems. However, authenticated encryption (AE)
+modes are not currently supported because of the difficulty of dealing
+with ciphertext expansion.
+
+For file contents, each filesystem block is encrypted independently.
+Currently, only the case where the filesystem block size is equal to
+the system's page size (usually 4096 bytes) is supported. With the
+XTS mode of operation (recommended), the logical block number within
+the file is used as the IV. With the CBC mode of operation (not
+recommended), ESSIV is used; specifically, the IV for CBC is the
+logical block number encrypted with AES-256, where the AES-256 key is
+the SHA-256 hash of the inode's data encryption key.
+
+For filenames, the full filename is encrypted at once. Because of the
+requirements to retain support for efficient directory lookups and
+filenames of up to 255 bytes, a constant initialization vector (IV) is
+used. However, each encrypted directory uses a unique key, which
+limits IV reuse to within a single directory. Note that IV reuse in
+the context of CTS-CBC encryption means that when the original
+filenames share a common prefix at least as long as the cipher block
+size (16 bytes for AES), the corresponding encrypted filenames will
+also share a common prefix. This is undesirable; it may be fixed in
+the future by switching to an encryption mode that is a strong
+pseudorandom permutation on arbitrary-length messages, e.g. the HEH
+(Hash-Encrypt-Hash) mode.
+
+Since filenames are encrypted with the CTS-CBC mode of operation, the
+plaintext and ciphertext filenames need not be multiples of the AES
+block size, i.e. 16 bytes. However, the minimum size that can be
+encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes
+before being encrypted. In addition, to reduce leakage of filename
+lengths via their ciphertexts, all filenames are NUL-padded to the
+next 4, 8, 16, or 32-byte boundary (configurable). 32 is recommended
+since this provides the best confidentiality, at the cost of making
+directory entries consume slightly more space. Note that since NUL
+(``\0``) is not otherwise a valid character in filenames, the padding
+will never produce duplicate plaintexts.
+
+Symbolic link targets are considered a type of filename and are
+encrypted in the same way as filenames in directory entries. Each
+symlink also uses a unique key; hence, the hardcoded IV is not a
+problem for symlinks.
+
+User API
+========
+
+Setting an encryption policy
+----------------------------
+
+The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an
+empty directory or verifies that a directory or regular file already
+has the specified encryption policy. It takes in a pointer to a
+:c:type:`struct fscrypt_policy`, defined as follows::
+
+ #define FS_KEY_DESCRIPTOR_SIZE 8
+
+ struct fscrypt_policy {
+ __u8 version;
+ __u8 contents_encryption_mode;
+ __u8 filenames_encryption_mode;
+ __u8 flags;
+ __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
+ };
+
+This structure must be initialized as follows:
+
+- ``version`` must be 0.
+
+- ``contents_encryption_mode`` and ``filenames_encryption_mode`` must
+ be set to constants from ``<linux/fs.h>`` which identify the
+ encryption modes to use. If unsure, use
+ FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode``
+ and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
+ ``filenames_encryption_mode``.
+
+- ``flags`` must be set to a value from ``<linux/fs.h>`` which
+ identifies the amount of NUL-padding to use when encrypting
+ filenames. If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).
+
+- ``master_key_descriptor`` specifies how to find the master key in
+ the keyring; see `Adding keys`_. It is up to userspace to choose a
+ unique ``master_key_descriptor`` for each master key. The e4crypt
+ and fscrypt tools use the first 8 bytes of
+ ``SHA-512(SHA-512(master_key))``, but this particular scheme is not
+ required. Also, the master key need not be in the keyring yet when
+ FS_IOC_SET_ENCRYPTION_POLICY is executed. However, it must be added
+ before any files can be created in the encrypted directory.
+
+If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY
+verifies that the file is an empty directory. If so, the specified
+encryption policy is assigned to the directory, turning it into an
+encrypted directory. After that, and after providing the
+corresponding master key as described in `Adding keys`_, all regular
+files, directories (recursively), and symlinks created in the
+directory will be encrypted, inheriting the same encryption policy.
+The filenames in the directory's entries will be encrypted as well.
+
+Alternatively, if the file is already encrypted, then
+FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption
+policy exactly matches the actual one. If they match, then the ioctl
+returns 0. Otherwise, it fails with EEXIST. This works on both
+regular files and directories, including nonempty directories.
+
+Note that the ext4 filesystem does not allow the root directory to be
+encrypted, even if it is empty. Users who want to encrypt an entire
+filesystem with one key should consider using dm-crypt instead.
+
+FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors:
+
+- ``EACCES``: the file is not owned by the process's uid, nor does the
+ process have the CAP_FOWNER capability in a namespace with the file
+ owner's uid mapped
+- ``EEXIST``: the file is already encrypted with an encryption policy
+ different from the one specified
+- ``EINVAL``: an invalid encryption policy was specified (invalid
+ version, mode(s), or flags)
+- ``ENOTDIR``: the file is unencrypted and is a regular file, not a
+ directory
+- ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory
+- ``ENOTTY``: this type of filesystem does not implement encryption
+- ``EOPNOTSUPP``: the kernel was not configured with encryption
+ support for this filesystem, or the filesystem superblock has not
+ had encryption enabled on it. (For example, to use encryption on an
+ ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the
+ kernel config, and the superblock must have had the "encrypt"
+ feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O
+ encrypt``.)
+- ``EPERM``: this directory may not be encrypted, e.g. because it is
+ the root directory of an ext4 filesystem
+- ``EROFS``: the filesystem is readonly
+
+Getting an encryption policy
+----------------------------
+
+The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct
+fscrypt_policy`, if any, for a directory or regular file. See above
+for the struct definition. No additional permissions are required
+beyond the ability to open the file.
+
+FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors:
+
+- ``EINVAL``: the file is encrypted, but it uses an unrecognized
+ encryption context format
+- ``ENODATA``: the file is not encrypted
+- ``ENOTTY``: this type of filesystem does not implement encryption
+- ``EOPNOTSUPP``: the kernel was not configured with encryption
+ support for this filesystem
+
+Note: if you only need to know whether a file is encrypted or not, on
+most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl
+and check for FS_ENCRYPT_FL, or to use the statx() system call and
+check for STATX_ATTR_ENCRYPTED in stx_attributes.
+
+Getting the per-filesystem salt
+-------------------------------
+
+Some filesystems, such as ext4 and F2FS, also support the deprecated
+ioctl FS_IOC_GET_ENCRYPTION_PWSALT. This ioctl retrieves a randomly
+generated 16-byte value stored in the filesystem superblock. This
+value is intended to used as a salt when deriving an encryption key
+from a passphrase or other low-entropy user credential.
+
+FS_IOC_GET_ENCRYPTION_PWSALT is deprecated. Instead, prefer to
+generate and manage any needed salt(s) in userspace.
+
+Adding keys
+-----------
+
+To provide a master key, userspace must add it to an appropriate
+keyring using the add_key() system call (see:
+``Documentation/security/keys/core.rst``). The key type must be
+"logon"; keys of this type are kept in kernel memory and cannot be
+read back by userspace. The key description must be "fscrypt:"
+followed by the 16-character lower case hex representation of the
+``master_key_descriptor`` that was set in the encryption policy. The
+key payload must conform to the following structure::
+
+ #define FS_MAX_KEY_SIZE 64
+
+ struct fscrypt_key {
+ u32 mode;
+ u8 raw[FS_MAX_KEY_SIZE];
+ u32 size;
+ };
+
+``mode`` is ignored; just set it to 0. The actual key is provided in
+``raw`` with ``size`` indicating its size in bytes. That is, the
+bytes ``raw[0..size-1]`` (inclusive) are the actual key.
+
+The key description prefix "fscrypt:" may alternatively be replaced
+with a filesystem-specific prefix such as "ext4:". However, the
+filesystem-specific prefixes are deprecated and should not be used in
+new programs.
+
+There are several different types of keyrings in which encryption keys
+may be placed, such as a session keyring, a user session keyring, or a
+user keyring. Each key must be placed in a keyring that is "attached"
+to all processes that might need to access files encrypted with it, in
+the sense that request_key() will find the key. Generally, if only
+processes belonging to a specific user need to access a given
+encrypted directory and no session keyring has been installed, then
+that directory's key should be placed in that user's user session
+keyring or user keyring. Otherwise, a session keyring should be
+installed if needed, and the key should be linked into that session
+keyring, or in a keyring linked into that session keyring.
+
+Note: introducing the complex visibility semantics of keyrings here
+was arguably a mistake --- especially given that by design, after any
+process successfully opens an encrypted file (thereby setting up the
+per-file key), possessing the keyring key is not actually required for
+any process to read/write the file until its in-memory inode is
+evicted. In the future there probably should be a way to provide keys
+directly to the filesystem instead, which would make the intended
+semantics clearer.
+
+Access semantics
+================
+
+With the key
+------------
+
+With the encryption key, encrypted regular files, directories, and
+symlinks behave very similarly to their unencrypted counterparts ---
+after all, the encryption is intended to be transparent. However,
+astute users may notice some differences in behavior:
+
+- Unencrypted files, or files encrypted with a different encryption
+ policy (i.e. different key, modes, or flags), cannot be renamed or
+ linked into an encrypted directory; see `Encryption policy
+ enforcement`_. Attempts to do so will fail with EPERM. However,
+ encrypted files can be renamed within an encrypted directory, or
+ into an unencrypted directory.
+
+- Direct I/O is not supported on encrypted files. Attempts to use
+ direct I/O on such files will fall back to buffered I/O.
+
+- The fallocate operations FALLOC_FL_COLLAPSE_RANGE,
+ FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported
+ on encrypted files and will fail with EOPNOTSUPP.
+
+- Online defragmentation of encrypted files is not supported. The
+ EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with
+ EOPNOTSUPP.
+
+- The ext4 filesystem does not support data journaling with encrypted
+ regular files. It will fall back to ordered data mode instead.
+
+- DAX (Direct Access) is not supported on encrypted files.
+
+- The st_size of an encrypted symlink will not necessarily give the
+ length of the symlink target as required by POSIX. It will actually
+ give the length of the ciphertext, which will be slightly longer
+ than the plaintext due to NUL-padding and an extra 2-byte overhead.
+
+- The maximum length of an encrypted symlink is 2 bytes shorter than
+ the maximum length of an unencrypted symlink. For example, on an
+ EXT4 filesystem with a 4K block size, unencrypted symlinks can be up
+ to 4095 bytes long, while encrypted symlinks can only be up to 4093
+ bytes long (both lengths excluding the terminating null).
+
+Note that mmap *is* supported. This is possible because the pagecache
+for an encrypted file contains the plaintext, not the ciphertext.
+
+Without the key
+---------------
+
+Some filesystem operations may be performed on encrypted regular
+files, directories, and symlinks even before their encryption key has
+been provided:
+
+- File metadata may be read, e.g. using stat().
+
+- Directories may be listed, in which case the filenames will be
+ listed in an encoded form derived from their ciphertext. The
+ current encoding algorithm is described in `Filename hashing and
+ encoding`_. The algorithm is subject to change, but it is
+ guaranteed that the presented filenames will be no longer than
+ NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and
+ will uniquely identify directory entries.
+
+ The ``.`` and ``..`` directory entries are special. They are always
+ present and are not encrypted or encoded.
+
+- Files may be deleted. That is, nondirectory files may be deleted
+ with unlink() as usual, and empty directories may be deleted with
+ rmdir() as usual. Therefore, ``rm`` and ``rm -r`` will work as
+ expected.
+
+- Symlink targets may be read and followed, but they will be presented
+ in encrypted form, similar to filenames in directories. Hence, they
+ are unlikely to point to anywhere useful.
+
+Without the key, regular files cannot be opened or truncated.
+Attempts to do so will fail with ENOKEY. This implies that any
+regular file operations that require a file descriptor, such as
+read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden.
+
+Also without the key, files of any type (including directories) cannot
+be created or linked into an encrypted directory, nor can a name in an
+encrypted directory be the source or target of a rename, nor can an
+O_TMPFILE temporary file be created in an encrypted directory. All
+such operations will fail with ENOKEY.
+
+It is not currently possible to backup and restore encrypted files
+without the encryption key. This would require special APIs which
+have not yet been implemented.
+
+Encryption policy enforcement
+=============================
+
+After an encryption policy has been set on a directory, all regular
+files, directories, and symbolic links created in that directory
+(recursively) will inherit that encryption policy. Special files ---
+that is, named pipes, device nodes, and UNIX domain sockets --- will
+not be encrypted.
+
+Except for those special files, it is forbidden to have unencrypted
+files, or files encrypted with a different encryption policy, in an
+encrypted directory tree. Attempts to link or rename such a file into
+an encrypted directory will fail with EPERM. This is also enforced
+during ->lookup() to provide limited protection against offline
+attacks that try to disable or downgrade encryption in known locations
+where applications may later write sensitive data. It is recommended
+that systems implementing a form of "verified boot" take advantage of
+this by validating all top-level encryption policies prior to access.
+
+Implementation details
+======================
+
+Encryption context
+------------------
+
+An encryption policy is represented on-disk by a :c:type:`struct
+fscrypt_context`. It is up to individual filesystems to decide where
+to store it, but normally it would be stored in a hidden extended
+attribute. It should *not* be exposed by the xattr-related system
+calls such as getxattr() and setxattr() because of the special
+semantics of the encryption xattr. (In particular, there would be
+much confusion if an encryption policy were to be added to or removed
+from anything other than an empty directory.) The struct is defined
+as follows::
+
+ #define FS_KEY_DESCRIPTOR_SIZE 8
+ #define FS_KEY_DERIVATION_NONCE_SIZE 16
+
+ struct fscrypt_context {
+ u8 format;
+ u8 contents_encryption_mode;
+ u8 filenames_encryption_mode;
+ u8 flags;
+ u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
+ u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
+ };
+
+Note that :c:type:`struct fscrypt_context` contains the same
+information as :c:type:`struct fscrypt_policy` (see `Setting an
+encryption policy`_), except that :c:type:`struct fscrypt_context`
+also contains a nonce. The nonce is randomly generated by the kernel
+and is used to derive the inode's encryption key as described in
+`Per-file keys`_.
+
+Data path changes
+-----------------
+
+For the read path (->readpage()) of regular files, filesystems can
+read the ciphertext into the page cache and decrypt it in-place. The
+page lock must be held until decryption has finished, to prevent the
+page from becoming visible to userspace prematurely.
+
+For the write path (->writepage()) of regular files, filesystems
+cannot encrypt data in-place in the page cache, since the cached
+plaintext must be preserved. Instead, filesystems must encrypt into a
+temporary buffer or "bounce page", then write out the temporary
+buffer. Some filesystems, such as UBIFS, already use temporary
+buffers regardless of encryption. Other filesystems, such as ext4 and
+F2FS, have to allocate bounce pages specially for encryption.
+
+Filename hashing and encoding
+-----------------------------
+
+Modern filesystems accelerate directory lookups by using indexed
+directories. An indexed directory is organized as a tree keyed by
+filename hashes. When a ->lookup() is requested, the filesystem
+normally hashes the filename being looked up so that it can quickly
+find the corresponding directory entry, if any.
+
+With encryption, lookups must be supported and efficient both with and
+without the encryption key. Clearly, it would not work to hash the
+plaintext filenames, since the plaintext filenames are unavailable
+without the key. (Hashing the plaintext filenames would also make it
+impossible for the filesystem's fsck tool to optimize encrypted
+directories.) Instead, filesystems hash the ciphertext filenames,
+i.e. the bytes actually stored on-disk in the directory entries. When
+asked to do a ->lookup() with the key, the filesystem just encrypts
+the user-supplied name to get the ciphertext.
+
+Lookups without the key are more complicated. The raw ciphertext may
+contain the ``\0`` and ``/`` characters, which are illegal in
+filenames. Therefore, readdir() must base64-encode the ciphertext for
+presentation. For most filenames, this works fine; on ->lookup(), the
+filesystem just base64-decodes the user-supplied name to get back to
+the raw ciphertext.
+
+However, for very long filenames, base64 encoding would cause the
+filename length to exceed NAME_MAX. To prevent this, readdir()
+actually presents long filenames in an abbreviated form which encodes
+a strong "hash" of the ciphertext filename, along with the optional
+filesystem-specific hash(es) needed for directory lookups. This
+allows the filesystem to still, with a high degree of confidence, map
+the filename given in ->lookup() back to a particular directory entry
+that was previously listed by readdir(). See :c:type:`struct
+fscrypt_digested_name` in the source for more details.
+
+Note that the precise way that filenames are presented to userspace
+without the key is subject to change in the future. It is only meant
+as a way to temporarily present valid filenames so that commands like
+``rm -r`` work as expected on encrypted directories.
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 6d2689e..d157847 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -43,6 +43,7 @@
3.7 /proc/<pid>/task/<tid>/children - Information about task children
3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file
3.9 /proc/<pid>/map_files - Information about memory mapped files
+ 3.10 /proc/<pid>/timerslack_ns - Task timerslack value
4 Configuring procfs
4.1 Mount options
@@ -380,6 +381,8 @@
[stack] = the stack of the main process
[vdso] = the "virtual dynamic shared object",
the kernel system call handler
+ [anon:<name>] = an anonymous mapping that has been
+ named by userspace
or if empty, the mapping is anonymous.
@@ -406,6 +409,7 @@
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd ex mr mw me dw
+Name: name from userspace
the first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps. The remaining lines show the size of the mapping
@@ -468,6 +472,9 @@
be present in all further kernel releases. Things get changed, the flags may
be vanished or the reverse -- new added.
+The "Name" field will only be present on a mapping that has been named by
+userspace, and will show the name passed in by userspace.
+
This file is only present if the CONFIG_MMU kernel configuration option is
enabled.
@@ -1821,6 +1828,23 @@
comparing their inode numbers to figure out which anonymous memory areas
are actually shared.
+3.10 /proc/<pid>/timerslack_ns - Task timerslack value
+---------------------------------------------------------
+This file provides the value of the task's timerslack value in nanoseconds.
+This value specifies a amount of time that normal timers may be deferred
+in order to coalesce timers and avoid unnecessary wakeups.
+
+This allows a task's interactivity vs power consumption trade off to be
+adjusted.
+
+Writing 0 to the file will set the tasks timerslack to the default value.
+
+Valid values are from 0 - ULLONG_MAX
+
+An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level
+permissions on the task specified to change its timerslack_ns value.
+
+
------------------------------------------------------------------------------
Configuring procfs
------------------------------------------------------------------------------
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 91261a3..b5ce7b6 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -307,6 +307,7 @@
0xA3 80-8F Port ACL in development:
<mailto:tlewis@mindspring.com>
0xA3 90-9F linux/dtlk.h
+0xA4 00-1F uapi/linux/tee.h Generic TEE subsystem
0xAA 00-3F linux/uapi/linux/userfaultfd.h
0xAB 00-1F linux/nbd.h
0xAC 00-1F linux/raw.h
diff --git a/Documentation/kasan.txt b/Documentation/kasan.txt
index aa1e0c9..7dd95b3 100644
--- a/Documentation/kasan.txt
+++ b/Documentation/kasan.txt
@@ -12,8 +12,7 @@
therefore you will need a GCC version 4.9.2 or later. GCC 5.0 or later is
required for detection of out-of-bounds accesses to stack or global variables.
-Currently KASAN is supported only for x86_64 architecture and requires the
-kernel to be built with the SLUB allocator.
+Currently KASAN is supported only for x86_64 architecture.
1. Usage
========
@@ -27,7 +26,7 @@
the latter is 1.1 - 2 times faster. Inline instrumentation requires a GCC
version 5.0 or later.
-Currently KASAN works only with the SLUB memory allocator.
+KASAN works with both SLUB and SLAB memory allocators.
For better bug detection and nicer reporting, enable CONFIG_STACKTRACE.
To disable instrumentation for specific files or directories, add a line
diff --git a/Documentation/kcov.txt b/Documentation/kcov.txt
new file mode 100644
index 0000000..779ff4a
--- /dev/null
+++ b/Documentation/kcov.txt
@@ -0,0 +1,111 @@
+kcov: code coverage for fuzzing
+===============================
+
+kcov exposes kernel code coverage information in a form suitable for coverage-
+guided fuzzing (randomized testing). Coverage data of a running kernel is
+exported via the "kcov" debugfs file. Coverage collection is enabled on a task
+basis, and thus it can capture precise coverage of a single system call.
+
+Note that kcov does not aim to collect as much coverage as possible. It aims
+to collect more or less stable coverage that is function of syscall inputs.
+To achieve this goal it does not collect coverage in soft/hard interrupts
+and instrumentation of some inherently non-deterministic parts of kernel is
+disbled (e.g. scheduler, locking).
+
+Usage:
+======
+
+Configure kernel with:
+
+ CONFIG_KCOV=y
+
+CONFIG_KCOV requires gcc built on revision 231296 or later.
+Profiling data will only become accessible once debugfs has been mounted:
+
+ mount -t debugfs none /sys/kernel/debug
+
+The following program demonstrates kcov usage from within a test program:
+
+#include <stdio.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
+#define KCOV_ENABLE _IO('c', 100)
+#define KCOV_DISABLE _IO('c', 101)
+#define COVER_SIZE (64<<10)
+
+int main(int argc, char **argv)
+{
+ int fd;
+ unsigned long *cover, n, i;
+
+ /* A single fd descriptor allows coverage collection on a single
+ * thread.
+ */
+ fd = open("/sys/kernel/debug/kcov", O_RDWR);
+ if (fd == -1)
+ perror("open"), exit(1);
+ /* Setup trace mode and trace size. */
+ if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
+ perror("ioctl"), exit(1);
+ /* Mmap buffer shared between kernel- and user-space. */
+ cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long),
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if ((void*)cover == MAP_FAILED)
+ perror("mmap"), exit(1);
+ /* Enable coverage collection on the current thread. */
+ if (ioctl(fd, KCOV_ENABLE, 0))
+ perror("ioctl"), exit(1);
+ /* Reset coverage from the tail of the ioctl() call. */
+ __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
+ /* That's the target syscal call. */
+ read(-1, NULL, 0);
+ /* Read number of PCs collected. */
+ n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
+ for (i = 0; i < n; i++)
+ printf("0x%lx\n", cover[i + 1]);
+ /* Disable coverage collection for the current thread. After this call
+ * coverage can be enabled for a different thread.
+ */
+ if (ioctl(fd, KCOV_DISABLE, 0))
+ perror("ioctl"), exit(1);
+ /* Free resources. */
+ if (munmap(cover, COVER_SIZE * sizeof(unsigned long)))
+ perror("munmap"), exit(1);
+ if (close(fd))
+ perror("close"), exit(1);
+ return 0;
+}
+
+After piping through addr2line output of the program looks as follows:
+
+SyS_read
+fs/read_write.c:562
+__fdget_pos
+fs/file.c:774
+__fget_light
+fs/file.c:746
+__fget_light
+fs/file.c:750
+__fget_light
+fs/file.c:760
+__fdget_pos
+fs/file.c:784
+SyS_read
+fs/read_write.c:562
+
+If a program needs to collect coverage from several threads (independently),
+it needs to open /sys/kernel/debug/kcov in each thread separately.
+
+The interface is fine-grained to allow efficient forking of test processes.
+That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode,
+mmaps coverage buffer and then forks child processes in a loop. Child processes
+only need to enable coverage (disable happens automatically on thread end).
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 3fd53e1..b44281c 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -56,6 +56,7 @@
BLACKFIN Blackfin architecture is enabled.
CLK Common clock infrastructure is enabled.
CMA Contiguous Memory Area support is enabled.
+ DM Device mapper support is enabled.
DRM Direct Rendering Management support is enabled.
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
@@ -915,6 +916,11 @@
dis_ucode_ldr [X86] Disable the microcode loader.
+ dm= [DM] Allows early creation of a device-mapper device.
+ See Documentation/device-mapper/boot.txt.
+
+ dmasound= [HW,OSS] Sound subsystem buff
+
dma_debug=off If the kernel is compiled with DMA_API_DEBUG support,
this option disables the debugging code at boot.
@@ -1447,6 +1453,41 @@
In such case C2/C3 won't be used again.
idle=nomwait: Disable mwait for CPU C-states
+ ieee754= [MIPS] Select IEEE Std 754 conformance mode
+ Format: { strict | legacy | 2008 | relaxed }
+ Default: strict
+
+ Choose which programs will be accepted for execution
+ based on the IEEE 754 NaN encoding(s) supported by
+ the FPU and the NaN encoding requested with the value
+ of an ELF file header flag individually set by each
+ binary. Hardware implementations are permitted to
+ support either or both of the legacy and the 2008 NaN
+ encoding mode.
+
+ Available settings are as follows:
+ strict accept binaries that request a NaN encoding
+ supported by the FPU
+ legacy only accept legacy-NaN binaries, if supported
+ by the FPU
+ 2008 only accept 2008-NaN binaries, if supported
+ by the FPU
+ relaxed accept any binaries regardless of whether
+ supported by the FPU
+
+ The FPU emulator is always able to support both NaN
+ encodings, so if no FPU hardware is present or it has
+ been disabled with 'nofpu', then the settings of
+ 'legacy' and '2008' strap the emulator accordingly,
+ 'relaxed' straps the emulator for both legacy-NaN and
+ 2008-NaN, whereas 'strict' enables legacy-NaN only on
+ legacy processors and both NaN encodings on MIPS32 or
+ MIPS64 CPUs.
+
+ The setting for ABS.fmt/NEG.fmt instruction execution
+ mode generally follows that for the NaN encoding,
+ except where unsupported by hardware.
+
ignore_loglevel [KNL]
Ignore loglevel setting - this will print /all/
kernel messages to the console. Useful for debugging.
@@ -2432,6 +2473,25 @@
noexec=on: enable non-executable mappings (default)
noexec=off: disable non-executable mappings
+ noexec [MIPS]
+ Force indicating stack and heap as non-executable or
+ executable regardless of PT_GNU_STACK entry or CPU XI
+ (execute inhibit) support. Valid valuess are: on, off.
+ noexec=on: force indicating non-executable
+ stack and heap
+ noexec=off: force indicating executable
+ stack and heap
+ If this parameter is omitted, stack and heap will be
+ indicated non-executable or executable as they are
+ actually set up, which depends on PT_GNU_STACK entry
+ and possibly other factors (for instance, CPU XI
+ support).
+ NOTE: Using noexec=on on a system without CPU XI
+ support is not recommended since there is no actual
+ HW support that provide non-executable stack/heap.
+ Use only for debugging purposes and not in a
+ production environment.
+
nosmap [X86]
Disable SMAP (Supervisor Mode Access Prevention)
even if it is supported by processor.
@@ -3448,6 +3508,10 @@
ro [KNL] Mount root device read-only on boot
+ rodata= [KNL]
+ on Mark read-only kernel memory as read-only (default).
+ off Leave read-only kernel memory writable for debugging.
+
root= [KNL] Root filesystem
See name_to_dev_t comment in init/do_mounts.c.
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 2ea4c45..5f1ea84 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -584,6 +584,16 @@
See include/net/tcp.h and the code for more details.
+tcp_fwmark_accept - BOOLEAN
+ If set, incoming connections to listening sockets that do not have a
+ socket mark will set the mark of the accepting socket to the fwmark of
+ the incoming SYN packet. This will cause all packets on that connection
+ (starting from the first SYNACK) to be sent with that fwmark. The
+ listening socket's mark is unchanged. Listening sockets that already
+ have a fwmark set via setsockopt(SOL_SOCKET, SO_MARK, ...) are
+ unaffected.
+ Default: 0
+
tcp_syn_retries - INTEGER
Number of times initial SYNs for an active TCP connection attempt
will be retransmitted. Should not be higher than 255. Default value
@@ -1403,11 +1413,20 @@
Functional default: enabled if accept_ra is enabled.
disabled if accept_ra is disabled.
+accept_ra_rt_info_min_plen - INTEGER
+ Minimum prefix length of Route Information in RA.
+
+ Route Information w/ prefix smaller than this variable shall
+ be ignored.
+
+ Functional default: 0 if accept_ra_rtr_pref is enabled.
+ -1 if accept_ra_rtr_pref is disabled.
+
accept_ra_rt_info_max_plen - INTEGER
Maximum prefix length of Route Information in RA.
- Route Information w/ prefix larger than or equal to this
- variable shall be ignored.
+ Route Information w/ prefix larger than this variable shall
+ be ignored.
Functional default: 0 if accept_ra_rtr_pref is enabled.
-1 if accept_ra_rtr_pref is disabled.
diff --git a/Documentation/ramoops.txt b/Documentation/ramoops.txt
index 5d86756..9264bca 100644
--- a/Documentation/ramoops.txt
+++ b/Documentation/ramoops.txt
@@ -45,7 +45,7 @@
2. Setting the parameters
-Setting the ramoops parameters can be done in 2 different manners:
+Setting the ramoops parameters can be done in 3 different manners:
1. Use the module parameters (which have the names of the variables described
as before).
For quick debugging, you can also reserve parts of memory during boot
@@ -54,7 +54,9 @@
kernel to use only the first 128 MB of memory, and place ECC-protected ramoops
region at 128 MB boundary:
"mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1"
- 2. Use a platform device and set the platform data. The parameters can then
+ 2. Use Device Tree bindings, as described in
+ Documentation/device-tree/bindings/misc/ramoops.txt.
+ 3. Use a platform device and set the platform data. The parameters can then
be set through that platform data. An example of doing that is:
#include <linux/pstore_ram.h>
diff --git a/Documentation/scheduler/sched-energy.txt b/Documentation/scheduler/sched-energy.txt
new file mode 100644
index 0000000..dab2f90
--- /dev/null
+++ b/Documentation/scheduler/sched-energy.txt
@@ -0,0 +1,362 @@
+Energy cost model for energy-aware scheduling (EXPERIMENTAL)
+
+Introduction
+=============
+
+The basic energy model uses platform energy data stored in sched_group_energy
+data structures attached to the sched_groups in the sched_domain hierarchy. The
+energy cost model offers two functions that can be used to guide scheduling
+decisions:
+
+1. static unsigned int sched_group_energy(struct energy_env *eenv)
+2. static int energy_diff(struct energy_env *eenv)
+
+sched_group_energy() estimates the energy consumed by all cpus in a specific
+sched_group including any shared resources owned exclusively by this group of
+cpus. Resources shared with other cpus are excluded (e.g. later level caches).
+
+energy_diff() estimates the total energy impact of a utilization change. That
+is, adding, removing, or migrating utilization (tasks).
+
+Both functions use a struct energy_env to specify the scenario to be evaluated:
+
+ struct energy_env {
+ struct sched_group *sg_top;
+ struct sched_group *sg_cap;
+ int cap_idx;
+ int util_delta;
+ int src_cpu;
+ int dst_cpu;
+ int energy;
+ };
+
+sg_top: sched_group to be evaluated. Not used by energy_diff().
+
+sg_cap: sched_group covering the cpus in the same frequency domain. Set by
+sched_group_energy().
+
+cap_idx: Capacity state to be used for energy calculations. Set by
+find_new_capacity().
+
+util_delta: Amount of utilization to be added, removed, or migrated.
+
+src_cpu: Source cpu from where 'util_delta' utilization is removed. Should be
+-1 if no source (e.g. task wake-up).
+
+dst_cpu: Destination cpu where 'util_delta' utilization is added. Should be -1
+if utilization is removed (e.g. terminating tasks).
+
+energy: Result of sched_group_energy().
+
+The metric used to represent utilization is the actual per-entity running time
+averaged over time using a geometric series. Very similar to the existing
+per-entity load-tracking, but _not_ scaled by task priority and capped by the
+capacity of the cpu. The latter property does mean that utilization may
+underestimate the compute requirements for task on fully/over utilized cpus.
+The greatest potential for energy savings without affecting performance too much
+is scenarios where the system isn't fully utilized. If the system is deemed
+fully utilized load-balancing should be done with task load (includes task
+priority) instead in the interest of fairness and performance.
+
+
+Background and Terminology
+===========================
+
+To make it clear from the start:
+
+energy = [joule] (resource like a battery on powered devices)
+power = energy/time = [joule/second] = [watt]
+
+The goal of energy-aware scheduling is to minimize energy, while still getting
+the job done. That is, we want to maximize:
+
+ performance [inst/s]
+ --------------------
+ power [W]
+
+which is equivalent to minimizing:
+
+ energy [J]
+ -----------
+ instruction
+
+while still getting 'good' performance. It is essentially an alternative
+optimization objective to the current performance-only objective for the
+scheduler. This alternative considers two objectives: energy-efficiency and
+performance. Hence, there needs to be a user controllable knob to switch the
+objective. Since it is early days, this is currently a sched_feature
+(ENERGY_AWARE).
+
+The idea behind introducing an energy cost model is to allow the scheduler to
+evaluate the implications of its decisions rather than applying energy-saving
+techniques blindly that may only have positive effects on some platforms. At
+the same time, the energy cost model must be as simple as possible to minimize
+the scheduler latency impact.
+
+Platform topology
+------------------
+
+The system topology (cpus, caches, and NUMA information, not peripherals) is
+represented in the scheduler by the sched_domain hierarchy which has
+sched_groups attached at each level that covers one or more cpus (see
+sched-domains.txt for more details). To add energy awareness to the scheduler
+we need to consider power and frequency domains.
+
+Power domain:
+
+A power domain is a part of the system that can be powered on/off
+independently. Power domains are typically organized in a hierarchy where you
+may be able to power down just a cpu or a group of cpus along with any
+associated resources (e.g. shared caches). Powering up a cpu means that all
+power domains it is a part of in the hierarchy must be powered up. Hence, it is
+more expensive to power up the first cpu that belongs to a higher level power
+domain than powering up additional cpus in the same high level domain. Two
+level power domain hierarchy example:
+
+ Power source
+ +-------------------------------+----...
+per group PD G G
+ | +----------+ |
+ +--------+-------| Shared | (other groups)
+per-cpu PD G G | resource |
+ | | +----------+
+ +-------+ +-------+
+ | CPU 0 | | CPU 1 |
+ +-------+ +-------+
+
+Frequency domain:
+
+Frequency domains (P-states) typically cover the same group of cpus as one of
+the power domain levels. That is, there might be several smaller power domains
+sharing the same frequency (P-state) or there might be a power domain spanning
+multiple frequency domains.
+
+From a scheduling point of view there is no need to know the actual frequencies
+[Hz]. All the scheduler cares about is the compute capacity available at the
+current state (P-state) the cpu is in and any other available states. For that
+reason, and to also factor in any cpu micro-architecture differences, compute
+capacity scaling states are called 'capacity states' in this document. For SMP
+systems this is equivalent to P-states. For mixed micro-architecture systems
+(like ARM big.LITTLE) it is P-states scaled according to the micro-architecture
+performance relative to the other cpus in the system.
+
+Energy modelling:
+------------------
+
+Due to the hierarchical nature of the power domains, the most obvious way to
+model energy costs is therefore to associate power and energy costs with
+domains (groups of cpus). Energy costs of shared resources are associated with
+the group of cpus that share the resources, only the cost of powering the
+cpu itself and any private resources (e.g. private L1 caches) is associated
+with the per-cpu groups (lowest level).
+
+For example, for an SMP system with per-cpu power domains and a cluster level
+(group of cpus) power domain we get the overall energy costs to be:
+
+ energy = energy_cluster + n * energy_cpu
+
+where 'n' is the number of cpus powered up and energy_cluster is the cost paid
+as soon as any cpu in the cluster is powered up.
+
+The power and frequency domains can naturally be mapped onto the existing
+sched_domain hierarchy and sched_groups by adding the necessary data to the
+existing data structures.
+
+The energy model considers energy consumption from two contributors (shown in
+the illustration below):
+
+1. Busy energy: Energy consumed while a cpu and the higher level groups that it
+belongs to are busy running tasks. Busy energy is associated with the state of
+the cpu, not an event. The time the cpu spends in this state varies. Thus, the
+most obvious platform parameter for this contribution is busy power
+(energy/time).
+
+2. Idle energy: Energy consumed while a cpu and higher level groups that it
+belongs to are idle (in a C-state). Like busy energy, idle energy is associated
+with the state of the cpu. Thus, the platform parameter for this contribution
+is idle power (energy/time).
+
+Energy consumed during transitions from an idle-state (C-state) to a busy state
+(P-state) or going the other way is ignored by the model to simplify the energy
+model calculations.
+
+
+ Power
+ ^
+ | busy->idle idle->busy
+ | transition transition
+ |
+ | _ __
+ | / \ / \__________________
+ |______________/ \ /
+ | \ /
+ | Busy \ Idle / Busy
+ | low P-state \____________/ high P-state
+ |
+ +------------------------------------------------------------> time
+
+Busy |--------------| |-----------------|
+
+Wakeup |------| |------|
+
+Idle |------------|
+
+
+The basic algorithm
+====================
+
+The basic idea is to determine the total energy impact when utilization is
+added or removed by estimating the impact at each level in the sched_domain
+hierarchy starting from the bottom (sched_group contains just a single cpu).
+The energy cost comes from busy time (sched_group is awake because one or more
+cpus are busy) and idle time (in an idle-state). Energy model numbers account
+for energy costs associated with all cpus in the sched_group as a group.
+
+ for_each_domain(cpu, sd) {
+ sg = sched_group_of(cpu)
+ energy_before = curr_util(sg) * busy_power(sg)
+ + (1-curr_util(sg)) * idle_power(sg)
+ energy_after = new_util(sg) * busy_power(sg)
+ + (1-new_util(sg)) * idle_power(sg)
+ energy_diff += energy_before - energy_after
+
+ }
+
+ return energy_diff
+
+{curr, new}_util: The cpu utilization at the lowest level and the overall
+non-idle time for the entire group for higher levels. Utilization is in the
+range 0.0 to 1.0 in the pseudo-code.
+
+busy_power: The power consumption of the sched_group.
+
+idle_power: The power consumption of the sched_group when idle.
+
+Note: It is a fundamental assumption that the utilization is (roughly) scale
+invariant. Task utilization tracking factors in any frequency scaling and
+performance scaling differences due to difference cpu microarchitectures such
+that task utilization can be used across the entire system.
+
+
+Platform energy data
+=====================
+
+struct sched_group_energy can be attached to sched_groups in the sched_domain
+hierarchy and has the following members:
+
+cap_states:
+ List of struct capacity_state representing the supported capacity states
+ (P-states). struct capacity_state has two members: cap and power, which
+ represents the compute capacity and the busy_power of the state. The
+ list must be ordered by capacity low->high.
+
+nr_cap_states:
+ Number of capacity states in cap_states list.
+
+idle_states:
+ List of struct idle_state containing idle_state power cost for each
+ idle-state supported by the system orderd by shallowest state first.
+ All states must be included at all level in the hierarchy, i.e. a
+ sched_group spanning just a single cpu must also include coupled
+ idle-states (cluster states). In addition to the cpuidle idle-states,
+ the list must also contain an entry for the idling using the arch
+ default idle (arch_idle_cpu()). Despite this state may not be a true
+ hardware idle-state it is considered the shallowest idle-state in the
+ energy model and must be the first entry. cpus may enter this state
+ (possibly 'active idling') if cpuidle decides not enter a cpuidle
+ idle-state. Default idle may not be used when cpuidle is enabled.
+ In this case, it should just be a copy of the first cpuidle idle-state.
+
+nr_idle_states:
+ Number of idle states in idle_states list.
+
+There are no unit requirements for the energy cost data. Data can be normalized
+with any reference, however, the normalization must be consistent across all
+energy cost data. That is, one bogo-joule/watt must be the same quantity for
+data, but we don't care what it is.
+
+A recipe for platform characterization
+=======================================
+
+Obtaining the actual model data for a particular platform requires some way of
+measuring power/energy. There isn't a tool to help with this (yet). This
+section provides a recipe for use as reference. It covers the steps used to
+characterize the ARM TC2 development platform. This sort of measurements is
+expected to be done anyway when tuning cpuidle and cpufreq for a given
+platform.
+
+The energy model needs two types of data (struct sched_group_energy holds
+these) for each sched_group where energy costs should be taken into account:
+
+1. Capacity state information
+
+A list containing the compute capacity and power consumption when fully
+utilized attributed to the group as a whole for each available capacity state.
+At the lowest level (group contains just a single cpu) this is the power of the
+cpu alone without including power consumed by resources shared with other cpus.
+It basically needs to fit the basic modelling approach described in "Background
+and Terminology" section:
+
+ energy_system = energy_shared + n * energy_cpu
+
+for a system containing 'n' busy cpus. Only 'energy_cpu' should be included at
+the lowest level. 'energy_shared' is included at the next level which
+represents the group of cpus among which the resources are shared.
+
+This model is, of course, a simplification of reality. Thus, power/energy
+attributions might not always exactly represent how the hardware is designed.
+Also, busy power is likely to depend on the workload. It is therefore
+recommended to use a representative mix of workloads when characterizing the
+capacity states.
+
+If the group has no capacity scaling support, the list will contain a single
+state where power is the busy power attributed to the group. The capacity
+should be set to a default value (1024).
+
+When frequency domains include multiple power domains, the group representing
+the frequency domain and all child groups share capacity states. This must be
+indicated by setting the SD_SHARE_CAP_STATES sched_domain flag. All groups at
+all levels that share the capacity state must have the list of capacity states
+with the power set to the contribution of the individual group.
+
+2. Idle power information
+
+Stored in the idle_states list. The power number is the group idle power
+consumption in each idle state as well when the group is idle but has not
+entered an idle-state ('active idle' as mentioned earlier). Due to the way the
+energy model is defined, the idle power of the deepest group idle state can
+alternatively be accounted for in the parent group busy power. In that case the
+group idle state power values are offset such that the idle power of the
+deepest state is zero. It is less intuitive, but it is easier to measure as
+idle power consumed by the group and the busy/idle power of the parent group
+cannot be distinguished without per group measurement points.
+
+Measuring capacity states and idle power:
+
+The capacity states' capacity and power can be estimated by running a benchmark
+workload at each available capacity state. By restricting the benchmark to run
+on subsets of cpus it is possible to extrapolate the power consumption of
+shared resources.
+
+ARM TC2 has two clusters of two and three cpus respectively. Each cluster has a
+shared L2 cache. TC2 has on-chip energy counters per cluster. Running a
+benchmark workload on just one cpu in a cluster means that power is consumed in
+the cluster (higher level group) and a single cpu (lowest level group). Adding
+another benchmark task to another cpu increases the power consumption by the
+amount consumed by the additional cpu. Hence, it is possible to extrapolate the
+cluster busy power.
+
+For platforms that don't have energy counters or equivalent instrumentation
+built-in, it may be possible to use an external DAQ to acquire similar data.
+
+If the benchmark includes some performance score (for example sysbench cpu
+benchmark), this can be used to record the compute capacity.
+
+Measuring idle power requires insight into the idle state implementation on the
+particular platform. Specifically, if the platform has coupled idle-states (or
+package states). To measure non-coupled per-cpu idle-states it is necessary to
+keep one cpu busy to keep any shared resources alive to isolate the idle power
+of the cpu from idle/busy power of the shared resources. The cpu can be tricked
+into different per-cpu idle states by disabling the other states. Based on
+various combinations of measurements with specific cpus busy and disabling
+idle-states it is possible to extrapolate the idle-state power.
diff --git a/Documentation/scheduler/sched-tune.txt b/Documentation/scheduler/sched-tune.txt
new file mode 100644
index 0000000..9bd2231
--- /dev/null
+++ b/Documentation/scheduler/sched-tune.txt
@@ -0,0 +1,366 @@
+ Central, scheduler-driven, power-performance control
+ (EXPERIMENTAL)
+
+Abstract
+========
+
+The topic of a single simple power-performance tunable, that is wholly
+scheduler centric, and has well defined and predictable properties has come up
+on several occasions in the past [1,2]. With techniques such as a scheduler
+driven DVFS [3], we now have a good framework for implementing such a tunable.
+This document describes the overall ideas behind its design and implementation.
+
+
+Table of Contents
+=================
+
+1. Motivation
+2. Introduction
+3. Signal Boosting Strategy
+4. OPP selection using boosted CPU utilization
+5. Per task group boosting
+6. Question and Answers
+ - What about "auto" mode?
+ - What about boosting on a congested system?
+ - How CPUs are boosted when we have tasks with multiple boost values?
+7. References
+
+
+1. Motivation
+=============
+
+Sched-DVFS [3] is a new event-driven cpufreq governor which allows the
+scheduler to select the optimal DVFS operating point (OPP) for running a task
+allocated to a CPU. The introduction of sched-DVFS enables running workloads at
+the most energy efficient OPPs.
+
+However, sometimes it may be desired to intentionally boost the performance of
+a workload even if that could imply a reasonable increase in energy
+consumption. For example, in order to reduce the response time of a task, we
+may want to run the task at a higher OPP than the one that is actually required
+by it's CPU bandwidth demand.
+
+This last requirement is especially important if we consider that one of the
+main goals of the sched-DVFS component is to replace all currently available
+CPUFreq policies. Since sched-DVFS is event based, as opposed to the sampling
+driven governors we currently have, it is already more responsive at selecting
+the optimal OPP to run tasks allocated to a CPU. However, just tracking the
+actual task load demand may not be enough from a performance standpoint. For
+example, it is not possible to get behaviors similar to those provided by the
+"performance" and "interactive" CPUFreq governors.
+
+This document describes an implementation of a tunable, stacked on top of the
+sched-DVFS which extends its functionality to support task performance
+boosting.
+
+By "performance boosting" we mean the reduction of the time required to
+complete a task activation, i.e. the time elapsed from a task wakeup to its
+next deactivation (e.g. because it goes back to sleep or it terminates). For
+example, if we consider a simple periodic task which executes the same workload
+for 5[s] every 20[s] while running at a certain OPP, a boosted execution of
+that task must complete each of its activations in less than 5[s].
+
+A previous attempt [5] to introduce such a boosting feature has not been
+successful mainly because of the complexity of the proposed solution. The
+approach described in this document exposes a single simple interface to
+user-space. This single tunable knob allows the tuning of system wide
+scheduler behaviours ranging from energy efficiency at one end through to
+incremental performance boosting at the other end. This first tunable affects
+all tasks. However, a more advanced extension of the concept is also provided
+which uses CGroups to boost the performance of only selected tasks while using
+the energy efficient default for all others.
+
+The rest of this document introduces in more details the proposed solution
+which has been named SchedTune.
+
+
+2. Introduction
+===============
+
+SchedTune exposes a simple user-space interface with a single power-performance
+tunable:
+
+ /proc/sys/kernel/sched_cfs_boost
+
+This permits expressing a boost value as an integer in the range [0..100].
+
+A value of 0 (default) configures the CFS scheduler for maximum energy
+efficiency. This means that sched-DVFS runs the tasks at the minimum OPP
+required to satisfy their workload demand.
+A value of 100 configures scheduler for maximum performance, which translates
+to the selection of the maximum OPP on that CPU.
+
+The range between 0 and 100 can be set to satisfy other scenarios suitably. For
+example to satisfy interactive response or depending on other system events
+(battery level etc).
+
+A CGroup based extension is also provided, which permits further user-space
+defined task classification to tune the scheduler for different goals depending
+on the specific nature of the task, e.g. background vs interactive vs
+low-priority.
+
+The overall design of the SchedTune module is built on top of "Per-Entity Load
+Tracking" (PELT) signals and sched-DVFS by introducing a bias on the Operating
+Performance Point (OPP) selection.
+Each time a task is allocated on a CPU, sched-DVFS has the opportunity to tune
+the operating frequency of that CPU to better match the workload demand. The
+selection of the actual OPP being activated is influenced by the global boost
+value, or the boost value for the task CGroup when in use.
+
+This simple biasing approach leverages existing frameworks, which means minimal
+modifications to the scheduler, and yet it allows to achieve a range of
+different behaviours all from a single simple tunable knob.
+The only new concept introduced is that of signal boosting.
+
+
+3. Signal Boosting Strategy
+===========================
+
+The whole PELT machinery works based on the value of a few load tracking signals
+which basically track the CPU bandwidth requirements for tasks and the capacity
+of CPUs. The basic idea behind the SchedTune knob is to artificially inflate
+some of these load tracking signals to make a task or RQ appears more demanding
+that it actually is.
+
+Which signals have to be inflated depends on the specific "consumer". However,
+independently from the specific (signal, consumer) pair, it is important to
+define a simple and possibly consistent strategy for the concept of boosting a
+signal.
+
+A boosting strategy defines how the "abstract" user-space defined
+sched_cfs_boost value is translated into an internal "margin" value to be added
+to a signal to get its inflated value:
+
+ margin := boosting_strategy(sched_cfs_boost, signal)
+ boosted_signal := signal + margin
+
+Different boosting strategies were identified and analyzed before selecting the
+one found to be most effective.
+
+Signal Proportional Compensation (SPC)
+--------------------------------------
+
+In this boosting strategy the sched_cfs_boost value is used to compute a
+margin which is proportional to the complement of the original signal.
+When a signal has a maximum possible value, its complement is defined as
+the delta from the actual value and its possible maximum.
+
+Since the tunable implementation uses signals which have SCHED_LOAD_SCALE as
+the maximum possible value, the margin becomes:
+
+ margin := sched_cfs_boost * (SCHED_LOAD_SCALE - signal)
+
+Using this boosting strategy:
+- a 100% sched_cfs_boost means that the signal is scaled to the maximum value
+- each value in the range of sched_cfs_boost effectively inflates the signal in
+ question by a quantity which is proportional to the maximum value.
+
+For example, by applying the SPC boosting strategy to the selection of the OPP
+to run a task it is possible to achieve these behaviors:
+
+- 0% boosting: run the task at the minimum OPP required by its workload
+- 100% boosting: run the task at the maximum OPP available for the CPU
+- 50% boosting: run at the half-way OPP between minimum and maximum
+
+Which means that, at 50% boosting, a task will be scheduled to run at half of
+the maximum theoretically achievable performance on the specific target
+platform.
+
+A graphical representation of an SPC boosted signal is represented in the
+following figure where:
+ a) "-" represents the original signal
+ b) "b" represents a 50% boosted signal
+ c) "p" represents a 100% boosted signal
+
+
+ ^
+ | SCHED_LOAD_SCALE
+ +-----------------------------------------------------------------+
+ |pppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppp
+ |
+ | boosted_signal
+ | bbbbbbbbbbbbbbbbbbbbbbbb
+ |
+ | original signal
+ | bbbbbbbbbbbbbbbbbbbbbbbb+----------------------+
+ | |
+ |bbbbbbbbbbbbbbbbbb |
+ | |
+ | |
+ | |
+ | +-----------------------+
+ | |
+ | |
+ | |
+ |------------------+
+ |
+ |
+ +----------------------------------------------------------------------->
+
+The plot above shows a ramped load signal (titled 'original_signal') and it's
+boosted equivalent. For each step of the original signal the boosted signal
+corresponding to a 50% boost is midway from the original signal and the upper
+bound. Boosting by 100% generates a boosted signal which is always saturated to
+the upper bound.
+
+
+4. OPP selection using boosted CPU utilization
+==============================================
+
+It is worth calling out that the implementation does not introduce any new load
+signals. Instead, it provides an API to tune existing signals. This tuning is
+done on demand and only in scheduler code paths where it is sensible to do so.
+The new API calls are defined to return either the default signal or a boosted
+one, depending on the value of sched_cfs_boost. This is a clean an non invasive
+modification of the existing existing code paths.
+
+The signal representing a CPU's utilization is boosted according to the
+previously described SPC boosting strategy. To sched-DVFS, this allows a CPU
+(ie CFS run-queue) to appear more used then it actually is.
+
+Thus, with the sched_cfs_boost enabled we have the following main functions to
+get the current utilization of a CPU:
+
+ cpu_util()
+ boosted_cpu_util()
+
+The new boosted_cpu_util() is similar to the first but returns a boosted
+utilization signal which is a function of the sched_cfs_boost value.
+
+This function is used in the CFS scheduler code paths where sched-DVFS needs to
+decide the OPP to run a CPU at.
+For example, this allows selecting the highest OPP for a CPU which has
+the boost value set to 100%.
+
+
+5. Per task group boosting
+==========================
+
+The availability of a single knob which is used to boost all tasks in the
+system is certainly a simple solution but it quite likely doesn't fit many
+utilization scenarios, especially in the mobile device space.
+
+For example, on battery powered devices there usually are many background
+services which are long running and need energy efficient scheduling. On the
+other hand, some applications are more performance sensitive and require an
+interactive response and/or maximum performance, regardless of the energy cost.
+To better service such scenarios, the SchedTune implementation has an extension
+that provides a more fine grained boosting interface.
+
+A new CGroup controller, namely "schedtune", could be enabled which allows to
+defined and configure task groups with different boosting values.
+Tasks that require special performance can be put into separate CGroups.
+The value of the boost associated with the tasks in this group can be specified
+using a single knob exposed by the CGroup controller:
+
+ schedtune.boost
+
+This knob allows the definition of a boost value that is to be used for
+SPC boosting of all tasks attached to this group.
+
+The current schedtune controller implementation is really simple and has these
+main characteristics:
+
+ 1) It is only possible to create 1 level depth hierarchies
+
+ The root control groups define the system-wide boost value to be applied
+ by default to all tasks. Its direct subgroups are named "boost groups" and
+ they define the boost value for specific set of tasks.
+ Further nested subgroups are not allowed since they do not have a sensible
+ meaning from a user-space standpoint.
+
+ 2) It is possible to define only a limited number of "boost groups"
+
+ This number is defined at compile time and by default configured to 16.
+ This is a design decision motivated by two main reasons:
+ a) In a real system we do not expect utilization scenarios with more then few
+ boost groups. For example, a reasonable collection of groups could be
+ just "background", "interactive" and "performance".
+ b) It simplifies the implementation considerably, especially for the code
+ which has to compute the per CPU boosting once there are multiple
+ RUNNABLE tasks with different boost values.
+
+Such a simple design should allow servicing the main utilization scenarios identified
+so far. It provides a simple interface which can be used to manage the
+power-performance of all tasks or only selected tasks.
+Moreover, this interface can be easily integrated by user-space run-times (e.g.
+Android, ChromeOS) to implement a QoS solution for task boosting based on tasks
+classification, which has been a long standing requirement.
+
+Setup and usage
+---------------
+
+0. Use a kernel with CGROUP_SCHEDTUNE support enabled
+
+1. Check that the "schedtune" CGroup controller is available:
+
+ root@linaro-nano:~# cat /proc/cgroups
+ #subsys_name hierarchy num_cgroups enabled
+ cpuset 0 1 1
+ cpu 0 1 1
+ schedtune 0 1 1
+
+2. Mount a tmpfs to create the CGroups mount point (Optional)
+
+ root@linaro-nano:~# sudo mount -t tmpfs cgroups /sys/fs/cgroup
+
+3. Mount the "schedtune" controller
+
+ root@linaro-nano:~# mkdir /sys/fs/cgroup/stune
+ root@linaro-nano:~# sudo mount -t cgroup -o schedtune stune /sys/fs/cgroup/stune
+
+4. Setup the system-wide boost value (Optional)
+
+ If not configured the root control group has a 0% boost value, which
+ basically disables boosting for all tasks in the system thus running in
+ an energy-efficient mode.
+
+ root@linaro-nano:~# echo $SYSBOOST > /sys/fs/cgroup/stune/schedtune.boost
+
+5. Create task groups and configure their specific boost value (Optional)
+
+ For example here we create a "performance" boost group configure to boost
+ all its tasks to 100%
+
+ root@linaro-nano:~# mkdir /sys/fs/cgroup/stune/performance
+ root@linaro-nano:~# echo 100 > /sys/fs/cgroup/stune/performance/schedtune.boost
+
+6. Move tasks into the boost group
+
+ For example, the following moves the tasks with PID $TASKPID (and all its
+ threads) into the "performance" boost group.
+
+ root@linaro-nano:~# echo "TASKPID > /sys/fs/cgroup/stune/performance/cgroup.procs
+
+This simple configuration allows only the threads of the $TASKPID task to run,
+when needed, at the highest OPP in the most capable CPU of the system.
+
+
+6. Question and Answers
+=======================
+
+What about "auto" mode?
+-----------------------
+
+The 'auto' mode as described in [5] can be implemented by interfacing SchedTune
+with some suitable user-space element. This element could use the exposed
+system-wide or cgroup based interface.
+
+How are multiple groups of tasks with different boost values managed?
+---------------------------------------------------------------------
+
+The current SchedTune implementation keeps track of the boosted RUNNABLE tasks
+on a CPU. Once sched-DVFS selects the OPP to run a CPU at, the CPU utilization
+is boosted with a value which is the maximum of the boost values of the
+currently RUNNABLE tasks in its RQ.
+
+This allows sched-DVFS to boost a CPU only while there are boosted tasks ready
+to run and switch back to the energy efficient mode as soon as the last boosted
+task is dequeued.
+
+
+7. References
+=============
+[1] http://lwn.net/Articles/552889
+[2] http://lkml.org/lkml/2012/5/18/91
+[3] http://lkml.org/lkml/2015/6/26/620
diff --git a/Documentation/sync.txt b/Documentation/sync.txt
new file mode 100644
index 0000000..a2d05e7
--- /dev/null
+++ b/Documentation/sync.txt
@@ -0,0 +1,75 @@
+Motivation:
+
+In complicated DMA pipelines such as graphics (multimedia, camera, gpu, display)
+a consumer of a buffer needs to know when the producer has finished producing
+it. Likewise the producer needs to know when the consumer is finished with the
+buffer so it can reuse it. A particular buffer may be consumed by multiple
+consumers which will retain the buffer for different amounts of time. In
+addition, a consumer may consume multiple buffers atomically.
+The sync framework adds an API which allows synchronization between the
+producers and consumers in a generic way while also allowing platforms which
+have shared hardware synchronization primitives to exploit them.
+
+Goals:
+ * provide a generic API for expressing synchronization dependencies
+ * allow drivers to exploit hardware synchronization between hardware
+ blocks
+ * provide a userspace API that allows a compositor to manage
+ dependencies.
+ * provide rich telemetry data to allow debugging slowdowns and stalls of
+ the graphics pipeline.
+
+Objects:
+ * sync_timeline
+ * sync_pt
+ * sync_fence
+
+sync_timeline:
+
+A sync_timeline is an abstract monotonically increasing counter. In general,
+each driver/hardware block context will have one of these. They can be backed
+by the appropriate hardware or rely on the generic sw_sync implementation.
+Timelines are only ever created through their specific implementations
+(i.e. sw_sync.)
+
+sync_pt:
+
+A sync_pt is an abstract value which marks a point on a sync_timeline. Sync_pts
+have a single timeline parent. They have 3 states: active, signaled, and error.
+They start in active state and transition, once, to either signaled (when the
+timeline counter advances beyond the sync_pt’s value) or error state.
+
+sync_fence:
+
+Sync_fences are the primary primitives used by drivers to coordinate
+synchronization of their buffers. They are a collection of sync_pts which may
+or may not have the same timeline parent. A sync_pt can only exist in one fence
+and the fence's list of sync_pts is immutable once created. Fences can be
+waited on synchronously or asynchronously. Two fences can also be merged to
+create a third fence containing a copy of the two fences’ sync_pts. Fences are
+backed by file descriptors to allow userspace to coordinate the display pipeline
+dependencies.
+
+Use:
+
+A driver implementing sync support should have a work submission function which:
+ * takes a fence argument specifying when to begin work
+ * asynchronously queues that work to kick off when the fence is signaled
+ * returns a fence to indicate when its work will be done.
+ * signals the returned fence once the work is completed.
+
+Consider an imaginary display driver that has the following API:
+/*
+ * assumes buf is ready to be displayed.
+ * blocks until the buffer is on screen.
+ */
+ void display_buffer(struct dma_buf *buf);
+
+The new API will become:
+/*
+ * will display buf when fence is signaled.
+ * returns immediately with a fence that will signal when buf
+ * is no longer displayed.
+ */
+struct sync_fence* display_buffer(struct dma_buf *buf,
+ struct sync_fence *fence);
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index be61d53..e7018fc 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -58,6 +58,8 @@
- panic_on_stackoverflow
- panic_on_unrecovered_nmi
- panic_on_warn
+- perf_cpu_time_max_percent
+- perf_event_paranoid
- pid_max
- powersave-nap [ PPC only ]
- printk
@@ -624,6 +626,19 @@
==============================================================
+perf_event_paranoid:
+
+Controls use of the performance events system by unprivileged
+users (without CAP_SYS_ADMIN). The default value is 3 if
+CONFIG_SECURITY_PERF_EVENTS_RESTRICT is set, or 1 otherwise.
+
+ -1: Allow use of (almost) all events by all users
+>=0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
+>=1: Disallow CPU event access by users without CAP_SYS_ADMIN
+>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
+>=3: Disallow all event access by users without CAP_SYS_ADMIN
+
+==============================================================
pid_max:
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index f72370b..c0397af 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -30,6 +30,7 @@
- dirty_writeback_centisecs
- drop_caches
- extfrag_threshold
+- extra_free_kbytes
- hugepages_treat_as_movable
- hugetlb_shm_group
- laptop_mode
@@ -42,6 +43,8 @@
- min_slab_ratio
- min_unmapped_ratio
- mmap_min_addr
+- mmap_rnd_bits
+- mmap_rnd_compat_bits
- nr_hugepages
- nr_overcommit_hugepages
- nr_trim_pages (only if CONFIG_MMU=n)
@@ -236,6 +239,21 @@
==============================================================
+extra_free_kbytes
+
+This parameter tells the VM to keep extra free memory between the threshold
+where background reclaim (kswapd) kicks in, and the threshold where direct
+reclaim (by allocating processes) kicks in.
+
+This is useful for workloads that require low latency memory allocations
+and have a bounded burstiness in memory allocations, for example a
+realtime application that receives and transmits network traffic
+(causing in-kernel memory allocations) with a maximum total message burst
+size of 200MB may need 200MB of extra free memory to avoid direct reclaim
+related latencies.
+
+==============================================================
+
hugepages_treat_as_movable
This parameter controls whether we can allocate hugepages from ZONE_MOVABLE
@@ -485,6 +503,33 @@
==============================================================
+mmap_rnd_bits:
+
+This value can be used to select the number of bits to use to
+determine the random offset to the base address of vma regions
+resulting from mmap allocations on architectures which support
+tuning address space randomization. This value will be bounded
+by the architecture's minimum and maximum supported values.
+
+This value can be changed after boot using the
+/proc/sys/vm/mmap_rnd_bits tunable
+
+==============================================================
+
+mmap_rnd_compat_bits:
+
+This value can be used to select the number of bits to use to
+determine the random offset to the base address of vma regions
+resulting from mmap allocations for applications run in
+compatibility mode on architectures which support tuning address
+space randomization. This value will be bounded by the
+architecture's minimum and maximum supported values.
+
+This value can be changed after boot using the
+/proc/sys/vm/mmap_rnd_compat_bits tunable
+
+==============================================================
+
nr_hugepages
Change the minimum size of the hugepage pool.
diff --git a/Documentation/tee.txt b/Documentation/tee.txt
new file mode 100644
index 0000000..56ea85f
--- /dev/null
+++ b/Documentation/tee.txt
@@ -0,0 +1,127 @@
+=============
+TEE subsystem
+=============
+
+This document describes the TEE subsystem in Linux.
+
+A TEE (Trusted Execution Environment) is a trusted OS running in some
+secure environment, for example, TrustZone on ARM CPUs, or a separate
+secure co-processor etc. A TEE driver handles the details needed to
+communicate with the TEE.
+
+This subsystem deals with:
+
+- Registration of TEE drivers
+
+- Managing shared memory between Linux and the TEE
+
+- Providing a generic API to the TEE
+
+The TEE interface
+=================
+
+include/uapi/linux/tee.h defines the generic interface to a TEE.
+
+User space (the client) connects to the driver by opening /dev/tee[0-9]* or
+/dev/teepriv[0-9]*.
+
+- TEE_IOC_SHM_ALLOC allocates shared memory and returns a file descriptor
+ which user space can mmap. When user space doesn't need the file
+ descriptor any more, it should be closed. When shared memory isn't needed
+ any longer it should be unmapped with munmap() to allow the reuse of
+ memory.
+
+- TEE_IOC_VERSION lets user space know which TEE this driver handles and
+ the its capabilities.
+
+- TEE_IOC_OPEN_SESSION opens a new session to a Trusted Application.
+
+- TEE_IOC_INVOKE invokes a function in a Trusted Application.
+
+- TEE_IOC_CANCEL may cancel an ongoing TEE_IOC_OPEN_SESSION or TEE_IOC_INVOKE.
+
+- TEE_IOC_CLOSE_SESSION closes a session to a Trusted Application.
+
+There are two classes of clients, normal clients and supplicants. The latter is
+a helper process for the TEE to access resources in Linux, for example file
+system access. A normal client opens /dev/tee[0-9]* and a supplicant opens
+/dev/teepriv[0-9].
+
+Much of the communication between clients and the TEE is opaque to the
+driver. The main job for the driver is to receive requests from the
+clients, forward them to the TEE and send back the results. In the case of
+supplicants the communication goes in the other direction, the TEE sends
+requests to the supplicant which then sends back the result.
+
+OP-TEE driver
+=============
+
+The OP-TEE driver handles OP-TEE [1] based TEEs. Currently it is only the ARM
+TrustZone based OP-TEE solution that is supported.
+
+Lowest level of communication with OP-TEE builds on ARM SMC Calling
+Convention (SMCCC) [2], which is the foundation for OP-TEE's SMC interface
+[3] used internally by the driver. Stacked on top of that is OP-TEE Message
+Protocol [4].
+
+OP-TEE SMC interface provides the basic functions required by SMCCC and some
+additional functions specific for OP-TEE. The most interesting functions are:
+
+- OPTEE_SMC_FUNCID_CALLS_UID (part of SMCCC) returns the version information
+ which is then returned by TEE_IOC_VERSION
+
+- OPTEE_SMC_CALL_GET_OS_UUID returns the particular OP-TEE implementation, used
+ to tell, for instance, a TrustZone OP-TEE apart from an OP-TEE running on a
+ separate secure co-processor.
+
+- OPTEE_SMC_CALL_WITH_ARG drives the OP-TEE message protocol
+
+- OPTEE_SMC_GET_SHM_CONFIG lets the driver and OP-TEE agree on which memory
+ range to used for shared memory between Linux and OP-TEE.
+
+The GlobalPlatform TEE Client API [5] is implemented on top of the generic
+TEE API.
+
+Picture of the relationship between the different components in the
+OP-TEE architecture::
+
+ User space Kernel Secure world
+ ~~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~~
+ +--------+ +-------------+
+ | Client | | Trusted |
+ +--------+ | Application |
+ /\ +-------------+
+ || +----------+ /\
+ || |tee- | ||
+ || |supplicant| \/
+ || +----------+ +-------------+
+ \/ /\ | TEE Internal|
+ +-------+ || | API |
+ + TEE | || +--------+--------+ +-------------+
+ | Client| || | TEE | OP-TEE | | OP-TEE |
+ | API | \/ | subsys | driver | | Trusted OS |
+ +-------+----------------+----+-------+----+-----------+-------------+
+ | Generic TEE API | | OP-TEE MSG |
+ | IOCTL (TEE_IOC_*) | | SMCCC (OPTEE_SMC_CALL_*) |
+ +-----------------------------+ +------------------------------+
+
+RPC (Remote Procedure Call) are requests from secure world to kernel driver
+or tee-supplicant. An RPC is identified by a special range of SMCCC return
+values from OPTEE_SMC_CALL_WITH_ARG. RPC messages which are intended for the
+kernel are handled by the kernel driver. Other RPC messages will be forwarded to
+tee-supplicant without further involvement of the driver, except switching
+shared memory buffer representation.
+
+References
+==========
+
+[1] https://github.com/OP-TEE/optee_os
+
+[2] http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html
+
+[3] drivers/tee/optee/optee_smc.h
+
+[4] drivers/tee/optee/optee_msg.h
+
+[5] http://www.globalplatform.org/specificationsdevice.asp look for
+ "TEE Client API Specification v1.0" and click download.
diff --git a/Documentation/trace/events-power.txt b/Documentation/trace/events-power.txt
index 21d514ce..4d817d5 100644
--- a/Documentation/trace/events-power.txt
+++ b/Documentation/trace/events-power.txt
@@ -25,6 +25,7 @@
cpu_idle "state=%lu cpu_id=%lu"
cpu_frequency "state=%lu cpu_id=%lu"
+cpu_frequency_limits "min=%lu max=%lu cpu_id=%lu"
A suspend event is used to indicate the system going in and out of the
suspend mode:
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index f52f297..fa16fb2 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -357,6 +357,26 @@
to correlate events across hypervisor/guest if
tb_offset is known.
+ mono: This uses the fast monotonic clock (CLOCK_MONOTONIC)
+ which is monotonic and is subject to NTP rate adjustments.
+
+ mono_raw:
+ This is the raw monotonic clock (CLOCK_MONOTONIC_RAW)
+ which is montonic but is not subject to any rate adjustments
+ and ticks at the same rate as the hardware clocksource.
+
+ boot: This is the boot clock (CLOCK_BOOTTIME) and is based on the
+ fast monotonic clock, but also accounts for time spent in
+ suspend. Since the clock access is designed for use in
+ tracing in the suspend path, some side effects are possible
+ if clock is accessed after the suspend time is accounted before
+ the fast mono clock is updated. In this case, the clock update
+ appears to happen slightly sooner than it normally would have.
+ Also on 32-bit systems, its possible that the 64-bit boot offset
+ sees a partial update. These effects are rare and post
+ processing should be able to handle them. See comments on
+ ktime_get_boot_fast_ns function for more information.
+
To set a clock, simply echo the clock name into this file.
echo global > trace_clock
@@ -2088,6 +2108,35 @@
1) 1.449 us | }
+You can disable the hierarchical function call formatting and instead print a
+flat list of function entry and return events. This uses the format described
+in the Output Formatting section and respects all the trace options that
+control that formatting. Hierarchical formatting is the default.
+
+ hierachical: echo nofuncgraph-flat > trace_options
+ flat: echo funcgraph-flat > trace_options
+
+ ie:
+
+ # tracer: function_graph
+ #
+ # entries-in-buffer/entries-written: 68355/68355 #P:2
+ #
+ # _-----=> irqs-off
+ # / _----=> need-resched
+ # | / _---=> hardirq/softirq
+ # || / _--=> preempt-depth
+ # ||| / delay
+ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
+ # | | | |||| | |
+ sh-1806 [001] d... 198.843443: graph_ent: func=_raw_spin_lock
+ sh-1806 [001] d... 198.843445: graph_ent: func=__raw_spin_lock
+ sh-1806 [001] d..1 198.843447: graph_ret: func=__raw_spin_lock
+ sh-1806 [001] d..1 198.843449: graph_ret: func=_raw_spin_lock
+ sh-1806 [001] d..1 198.843451: graph_ent: func=_raw_spin_unlock_irqrestore
+ sh-1806 [001] d... 198.843453: graph_ret: func=_raw_spin_unlock_irqrestore
+
+
You might find other useful features for this tracer in the
following "dynamic ftrace" section such as tracing only specific
functions or tasks.
diff --git a/Kbuild b/Kbuild
index f55cefd..f56ed56 100644
--- a/Kbuild
+++ b/Kbuild
@@ -6,31 +6,6 @@
# 3) Generate asm-offsets.h (may need bounds.h and timeconst.h)
# 4) Check for missing system calls
-# Default sed regexp - multiline due to syntax constraints
-define sed-y
- "/^->/{s:->#\(.*\):/* \1 */:; \
- s:^->\([^ ]*\) [\$$#]*\([-0-9]*\) \(.*\):#define \1 \2 /* \3 */:; \
- s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; \
- s:->::; p;}"
-endef
-
-# Use filechk to avoid rebuilds when a header changes, but the resulting file
-# does not
-define filechk_offsets
- (set -e; \
- echo "#ifndef $2"; \
- echo "#define $2"; \
- echo "/*"; \
- echo " * DO NOT MODIFY."; \
- echo " *"; \
- echo " * This file was generated by Kbuild"; \
- echo " */"; \
- echo ""; \
- sed -ne $(sed-y); \
- echo ""; \
- echo "#endif" )
-endef
-
#####
# 1) Generate bounds.h
diff --git a/MAINTAINERS b/MAINTAINERS
index ab65bbe..8368ad5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -746,6 +746,11 @@
S: Supported
F: drivers/dma/dma-axi-dmac.c
+ANDROID CONFIG FRAGMENTS
+M: Rob Herring <robh@kernel.org>
+S: Supported
+F: kernel/configs/android*
+
ANDROID DRIVERS
M: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
M: Arve Hjønnevåg <arve@android.com>
@@ -756,6 +761,18 @@
F: drivers/android/
F: drivers/staging/android/
+ANDROID GOLDFISH PIC DRIVER
+M: Miodrag Dinic <miodrag.dinic@mips.com>
+S: Supported
+F: Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
+F: drivers/irqchip/irq-goldfish-pic.c
+
+ANDROID GOLDFISH RTC DRIVER
+M: Miodrag Dinic <miodrag.dinic@mips.com>
+S: Supported
+F: Documentation/devicetree/bindings/rtc/google,goldfish-rtc.txt
+F: drivers/rtc/rtc-goldfish.c
+
AOA (Apple Onboard Audio) ALSA DRIVER
M: Johannes Berg <johannes@sipsolutions.net>
L: linuxppc-dev@lists.ozlabs.org
@@ -5982,6 +5999,20 @@
F: Documentation/hwmon/k8temp
F: drivers/hwmon/k8temp.c
+KASAN
+M: Andrey Ryabinin <aryabinin@virtuozzo.com>
+R: Alexander Potapenko <glider@google.com>
+R: Dmitry Vyukov <dvyukov@google.com>
+L: kasan-dev@googlegroups.com
+S: Maintained
+F: arch/*/include/asm/kasan.h
+F: arch/*/mm/kasan_init*
+F: Documentation/kasan.txt
+F: include/linux/kasan*.h
+F: lib/test_kasan.c
+F: mm/kasan/
+F: scripts/Makefile.kasan
+
KCONFIG
M: "Yann E. MORIN" <yann.morin.1998@free.fr>
L: linux-kbuild@vger.kernel.org
@@ -7040,6 +7071,20 @@
F: Documentation/mips/
F: arch/mips/
+MIPS GENERIC PLATFORM
+M: Paul Burton <paul.burton@imgtec.com>
+L: linux-mips@linux-mips.org
+S: Supported
+F: Documentation/devicetree/bindings/power/mti,mips-cpc.txt
+F: arch/mips/generic/
+
+MIPS RINT INSTRUCTION EMULATION
+M: Aleksandar Markovic <aleksandar.markovic@mips.com>
+L: linux-mips@linux-mips.org
+S: Supported
+F: arch/mips/math-emu/sp_rint.c
+F: arch/mips/math-emu/dp_rint.c
+
MIROSOUND PCM20 FM RADIO RECEIVER DRIVER
M: Hans Verkuil <hverkuil@xs4all.nl>
L: linux-media@vger.kernel.org
@@ -7935,6 +7980,11 @@
F: drivers/oprofile/
F: include/linux/oprofile.h
+OP-TEE DRIVER
+M: Jens Wiklander <jens.wiklander@linaro.org>
+S: Maintained
+F: drivers/tee/optee/
+
ORACLE CLUSTER FILESYSTEM 2 (OCFS2)
M: Mark Fasheh <mfasheh@suse.com>
M: Joel Becker <jlbec@evilplan.org>
@@ -8837,6 +8887,13 @@
F: Documentation/blockdev/ramdisk.txt
F: drivers/block/brd.c
+RANCHU VIRTUAL BOARD FOR MIPS
+M: Miodrag Dinic <miodrag.dinic@mips.com>
+L: linux-mips@linux-mips.org
+S: Supported
+F: arch/mips/generic/board-ranchu.c
+F: arch/mips/configs/generic/board-ranchu.config
+
RANDOM NUMBER DRIVER
M: "Theodore Ts'o" <tytso@mit.edu>
S: Maintained
@@ -9361,6 +9418,14 @@
F: include/linux/stm.h
F: include/uapi/linux/stm.h
+TEE SUBSYSTEM
+M: Jens Wiklander <jens.wiklander@linaro.org>
+S: Maintained
+F: include/linux/tee_drv.h
+F: include/uapi/linux/tee.h
+F: drivers/tee/
+F: Documentation/tee.txt
+
THUNDERBOLT DRIVER
M: Andreas Noever <andreas.noever@gmail.com>
S: Maintained
diff --git a/Makefile b/Makefile
index be31491..73cc5df 100644
--- a/Makefile
+++ b/Makefile
@@ -148,7 +148,7 @@
$(filter-out _all sub-make $(CURDIR)/Makefile, $(MAKECMDGOALS)) _all: sub-make
@:
-sub-make: FORCE
+sub-make:
$(Q)$(MAKE) -C $(KBUILD_OUTPUT) KBUILD_SRC=$(CURDIR) \
-f $(CURDIR)/Makefile $(filter-out _all sub-make,$(MAKECMDGOALS))
@@ -303,7 +303,7 @@
HOSTCC = gcc
HOSTCXX = g++
-HOSTCFLAGS = -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 -fomit-frame-pointer -std=gnu89
+HOSTCFLAGS := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 -fomit-frame-pointer -std=gnu89
HOSTCXXFLAGS = -O2
ifeq ($(shell $(HOSTCC) -v 2>&1 | grep -c "clang version"), 1)
@@ -371,6 +371,7 @@
CFLAGS_KERNEL =
AFLAGS_KERNEL =
CFLAGS_GCOV = -fprofile-arcs -ftest-coverage -fno-tree-loop-im
+CFLAGS_KCOV = -fsanitize-coverage=trace-pc
# Use USERINCLUDE when you must reference the UAPI directories only.
@@ -418,7 +419,7 @@
export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
-export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KASAN
+export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KCOV CFLAGS_KASAN
export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
@@ -627,7 +628,7 @@
KBUILD_CFLAGS += $(call cc-disable-warning, attribute-alias)
ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
-KBUILD_CFLAGS += -Os
+KBUILD_CFLAGS += $(call cc-option,-Oz,-Os)
else
ifdef CONFIG_PROFILE_ALL_BRANCHES
KBUILD_CFLAGS += -O2
@@ -696,12 +697,31 @@
endif
KBUILD_CFLAGS += $(stackp-flag)
+ifdef CONFIG_KCOV
+ ifeq ($(call cc-option, $(CFLAGS_KCOV)),)
+ $(warning Cannot use CONFIG_KCOV: \
+ -fsanitize-coverage=trace-pc is not supported by compiler)
+ CFLAGS_KCOV =
+ endif
+endif
+
ifeq ($(cc-name),clang)
+ifneq ($(CROSS_COMPILE),)
+CLANG_TRIPLE ?= $(CROSS_COMPILE)
+CLANG_TARGET := --target=$(notdir $(CLANG_TRIPLE:%-=%))
+GCC_TOOLCHAIN := $(realpath $(dir $(shell which $(LD)))/..)
+endif
+ifneq ($(GCC_TOOLCHAIN),)
+CLANG_GCC_TC := --gcc-toolchain=$(GCC_TOOLCHAIN)
+endif
+KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
+KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
-KBUILD_CPPFLAGS += $(call cc-option,-Wno-unknown-warning-option,)
KBUILD_CFLAGS += $(call cc-disable-warning, unused-variable)
KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
+KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
+KBUILD_CFLAGS += $(call cc-disable-warning, duplicate-decl-specifier)
# Quiet clang warning: comparison of unsigned expression < 0 is always false
KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare)
# CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the
@@ -709,6 +729,8 @@
# See modpost pattern 2
KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,)
KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior)
+KBUILD_CFLAGS += $(call cc-option, -no-integrated-as)
+KBUILD_AFLAGS += $(call cc-option, -no-integrated-as)
else
# These warnings generated too much noise in a regular build.
@@ -1018,7 +1040,7 @@
archprepare: archheaders archscripts prepare1 scripts_basic
-prepare0: archprepare FORCE
+prepare0: archprepare
$(Q)$(MAKE) $(build)=.
# All the preparing..
@@ -1063,7 +1085,7 @@
export INSTALL_FW_PATH
PHONY += firmware_install
-firmware_install: FORCE
+firmware_install:
@mkdir -p $(objtree)/firmware
$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.fwinst obj=firmware __fw_install
@@ -1083,7 +1105,7 @@
archscripts:
PHONY += __headers
-__headers: $(version_h) scripts_basic asm-generic archheaders archscripts FORCE
+__headers: $(version_h) scripts_basic asm-generic archheaders archscripts
$(Q)$(MAKE) $(build)=scripts build_unifdef
PHONY += headers_install_all
@@ -1296,6 +1318,8 @@
@echo ' (default: $$(INSTALL_MOD_PATH)/lib/firmware)'
@echo ' dir/ - Build all files in dir and below'
@echo ' dir/file.[ois] - Build specified target only'
+ @echo ' dir/file.ll - Build the LLVM assembly file'
+ @echo ' (requires compiler support for LLVM assembly generation)'
@echo ' dir/file.lst - Build specified mixed source/assembly target only'
@echo ' (requires a recent binutils and recent build (System.map))'
@echo ' dir/file.ko - Build module including final link'
@@ -1471,6 +1495,7 @@
-o -name '.*.d' -o -name '.*.tmp' -o -name '*.mod.c' \
-o -name '*.symtypes' -o -name 'modules.order' \
-o -name modules.builtin -o -name '.tmp_*.o.*' \
+ -o -name '*.ll' \
-o -name '*.gcno' \) -type f -print | xargs rm -f
# Generate tags for editors
@@ -1574,6 +1599,8 @@
$(Q)$(MAKE) $(build)=$(build-dir) $(target-dir)$(notdir $@)
%.symtypes: %.c prepare scripts FORCE
$(Q)$(MAKE) $(build)=$(build-dir) $(target-dir)$(notdir $@)
+%.ll: %.c prepare scripts FORCE
+ $(Q)$(MAKE) $(build)=$(build-dir) $(target-dir)$(notdir $@)
# Modules
/: prepare scripts FORCE
diff --git a/android/configs/android-fetch-configs.sh b/android/configs/android-fetch-configs.sh
new file mode 100755
index 0000000..9915c13
--- /dev/null
+++ b/android/configs/android-fetch-configs.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+curl https://android.googlesource.com/kernel/configs/+archive/master/android-4.4.tar.gz | tar xzv
+
diff --git a/arch/Kconfig b/arch/Kconfig
index 4e949e5..a29c609 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -225,8 +225,8 @@
config ARCH_TASK_STRUCT_ALLOCATOR
bool
-# Select if arch has its private alloc_thread_info() function
-config ARCH_THREAD_INFO_ALLOCATOR
+# Select if arch has its private alloc_thread_stack() function
+config ARCH_THREAD_STACK_ALLOCATOR
bool
# Select if arch wants to size task_struct dynamically via arch_task_struct_size:
@@ -423,6 +423,15 @@
endchoice
+config HAVE_ARCH_WITHIN_STACK_FRAMES
+ bool
+ help
+ An architecture should select this if it can walk the kernel stack
+ frames to determine if an object is part of either the arguments
+ or local variables (i.e. that it excludes saved return addresses,
+ and similar) by implementing an inline arch_within_stack_frames(),
+ which is used by CONFIG_HARDENED_USERCOPY.
+
config HAVE_CONTEXT_TRACKING
bool
help
@@ -518,6 +527,79 @@
normal C parameter passing, rather than extracting the syscall
argument from pt_regs.
+config HAVE_ARCH_MMAP_RND_BITS
+ bool
+ help
+ An arch should select this symbol if it supports setting a variable
+ number of bits for use in establishing the base address for mmap
+ allocations, has MMU enabled and provides values for both:
+ - ARCH_MMAP_RND_BITS_MIN
+ - ARCH_MMAP_RND_BITS_MAX
+
+config HAVE_EXIT_THREAD
+ bool
+ help
+ An architecture implements exit_thread.
+
+config ARCH_MMAP_RND_BITS_MIN
+ int
+
+config ARCH_MMAP_RND_BITS_MAX
+ int
+
+config ARCH_MMAP_RND_BITS_DEFAULT
+ int
+
+config ARCH_MMAP_RND_BITS
+ int "Number of bits to use for ASLR of mmap base address" if EXPERT
+ range ARCH_MMAP_RND_BITS_MIN ARCH_MMAP_RND_BITS_MAX
+ default ARCH_MMAP_RND_BITS_DEFAULT if ARCH_MMAP_RND_BITS_DEFAULT
+ default ARCH_MMAP_RND_BITS_MIN
+ depends on HAVE_ARCH_MMAP_RND_BITS
+ help
+ This value can be used to select the number of bits to use to
+ determine the random offset to the base address of vma regions
+ resulting from mmap allocations. This value will be bounded
+ by the architecture's minimum and maximum supported values.
+
+ This value can be changed after boot using the
+ /proc/sys/vm/mmap_rnd_bits tunable
+
+config HAVE_ARCH_MMAP_RND_COMPAT_BITS
+ bool
+ help
+ An arch should select this symbol if it supports running applications
+ in compatibility mode, supports setting a variable number of bits for
+ use in establishing the base address for mmap allocations, has MMU
+ enabled and provides values for both:
+ - ARCH_MMAP_RND_COMPAT_BITS_MIN
+ - ARCH_MMAP_RND_COMPAT_BITS_MAX
+
+config ARCH_MMAP_RND_COMPAT_BITS_MIN
+ int
+
+config ARCH_MMAP_RND_COMPAT_BITS_MAX
+ int
+
+config ARCH_MMAP_RND_COMPAT_BITS_DEFAULT
+ int
+
+config ARCH_MMAP_RND_COMPAT_BITS
+ int "Number of bits to use for ASLR of mmap base address for compatible applications" if EXPERT
+ range ARCH_MMAP_RND_COMPAT_BITS_MIN ARCH_MMAP_RND_COMPAT_BITS_MAX
+ default ARCH_MMAP_RND_COMPAT_BITS_DEFAULT if ARCH_MMAP_RND_COMPAT_BITS_DEFAULT
+ default ARCH_MMAP_RND_COMPAT_BITS_MIN
+ depends on HAVE_ARCH_MMAP_RND_COMPAT_BITS
+ help
+ This value can be used to select the number of bits to use to
+ determine the random offset to the base address of vma regions
+ resulting from mmap allocations for compatible applications This
+ value will be bounded by the architecture's minimum and maximum
+ supported values.
+
+ This value can be changed after boot using the
+ /proc/sys/vm/mmap_rnd_compat_bits tunable
+
#
# ABI hall of shame
#
diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index 8095fb2..60c17b9 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -210,14 +210,6 @@
}
EXPORT_SYMBOL(start_thread);
-/*
- * Free current thread data structures etc..
- */
-void
-exit_thread(void)
-{
-}
-
void
flush_thread(void)
{
diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c
index a3f750e..b5db9e7 100644
--- a/arch/arc/kernel/process.c
+++ b/arch/arc/kernel/process.c
@@ -183,13 +183,6 @@
{
}
-/*
- * Free any architecture-specific thread data structures, etc.
- */
-void exit_thread(void)
-{
-}
-
int dump_fpu(struct pt_regs *regs, elf_fpregset_t *fpu)
{
return 0;
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 34e1569..b7e2f822 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -33,10 +33,13 @@
select HARDIRQS_SW_RESEND
select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
+ select HAVE_ARCH_HARDENED_USERCOPY
select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32
select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32
+ select HAVE_ARCH_MMAP_RND_BITS if MMU
select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
select HAVE_ARCH_TRACEHOOK
+ select HAVE_ARM_SMCCC if CPU_V7
select HAVE_BPF_JIT
select HAVE_CC_STACKPROTECTOR
select HAVE_CONTEXT_TRACKING
@@ -47,6 +50,7 @@
select HAVE_DMA_CONTIGUOUS if MMU
select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32
select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
+ select HAVE_EXIT_THREAD
select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
select HAVE_FUNCTION_GRAPH_TRACER if (!THUMB2_KERNEL)
select HAVE_FUNCTION_TRACER if (!XIP_KERNEL)
@@ -308,6 +312,14 @@
Select if you want MMU-based virtualised addressing space
support by paged memory management. If unsure, say 'Y'.
+config ARCH_MMAP_RND_BITS_MIN
+ default 8
+
+config ARCH_MMAP_RND_BITS_MAX
+ default 14 if PAGE_OFFSET=0x40000000
+ default 15 if PAGE_OFFSET=0x80000000
+ default 16
+
#
# The "ARM system type" choice list is ordered alphabetically by option
# text. Please add new entries in the option alphabetic order.
@@ -1481,7 +1493,7 @@
config ARM_PSCI
bool "Support for the ARM Power State Coordination Interface (PSCI)"
- depends on CPU_V7
+ depends on HAVE_ARM_SMCCC
select ARM_PSCI_FW
help
Say Y here if you want Linux to communicate with system firmware
@@ -1816,6 +1828,15 @@
help
Say Y if you want to run Linux in a Virtual Machine on Xen on ARM.
+config ARM_FLUSH_CONSOLE_ON_RESTART
+ bool "Force flush the console on restart"
+ help
+ If the console is locked while the system is rebooted, the messages
+ in the temporary logbuffer would not have propogated to all the
+ console drivers. This option forces the console lock to be
+ released if it failed to be acquired, which will cause all the
+ pending messages to be flushed.
+
endmenu
menu "Boot options"
@@ -1844,6 +1865,21 @@
This was deprecated in 2001 and announced to live on for 5 years.
Some old boot loaders still use this way.
+config BUILD_ARM_APPENDED_DTB_IMAGE
+ bool "Build a concatenated zImage/dtb by default"
+ depends on OF
+ help
+ Enabling this option will cause a concatenated zImage and list of
+ DTBs to be built by default (instead of a standalone zImage.)
+ The image will built in arch/arm/boot/zImage-dtb
+
+config BUILD_ARM_APPENDED_DTB_IMAGE_NAMES
+ string "Default dtb names"
+ depends on BUILD_ARM_APPENDED_DTB_IMAGE
+ help
+ Space separated list of names of dtbs to append when
+ building a concatenated zImage-dtb.
+
# Compressed boot loader in ROM. Yes, we really want to ask about
# TEXT and BSS so we preserve their values in the config files.
config ZBOOT_ROM_TEXT
diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index ddbb361..f08e4fd 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -1618,6 +1618,14 @@
kernel low-level debugging functions. Add earlyprintk to your
kernel parameters to enable this console.
+config EARLY_PRINTK_DIRECT
+ bool "Early printk direct"
+ depends on DEBUG_LL
+ help
+ Say Y here if you want to have an early console using the
+ kernel low-level debugging functions and EARLY_PRINTK is
+ not early enough.
+
config ARM_KPROBES_TEST
tristate "Kprobes test module"
depends on KPROBES && MODULES
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 2c2b28e..88e479c 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -296,6 +296,8 @@
# Default target when executing plain make
ifeq ($(CONFIG_XIP_KERNEL),y)
KBUILD_IMAGE := xipImage
+else ifeq ($(CONFIG_BUILD_ARM_APPENDED_DTB_IMAGE),y)
+KBUILD_IMAGE := zImage-dtb
else
KBUILD_IMAGE := zImage
endif
@@ -346,6 +348,9 @@
$(Q)$(MAKE) $(build)=arch/arm/vdso $@
endif
+zImage-dtb: vmlinux scripts dtbs
+ $(Q)$(MAKE) $(build)=$(boot) MACHINE=$(MACHINE) $(boot)/$@
+
# We use MRPROPER_FILES and CLEAN_FILES now
archclean:
$(Q)$(MAKE) $(clean)=$(boot)
diff --git a/arch/arm/boot/.gitignore b/arch/arm/boot/.gitignore
index 3c79f85..ad7a025 100644
--- a/arch/arm/boot/.gitignore
+++ b/arch/arm/boot/.gitignore
@@ -4,3 +4,4 @@
bootpImage
uImage
*.dtb
+zImage-dtb
\ No newline at end of file
diff --git a/arch/arm/boot/Makefile b/arch/arm/boot/Makefile
index 9eca7ae..4a04ed3d 100644
--- a/arch/arm/boot/Makefile
+++ b/arch/arm/boot/Makefile
@@ -14,6 +14,7 @@
ifneq ($(MACHINE),)
include $(MACHINE)/Makefile.boot
endif
+include $(srctree)/arch/arm/boot/dts/Makefile
# Note: the following conditions must always be true:
# ZRELADDR == virt_to_phys(PAGE_OFFSET + TEXT_OFFSET)
@@ -27,6 +28,14 @@
targets := Image zImage xipImage bootpImage uImage
+DTB_NAMES := $(subst $\",,$(CONFIG_BUILD_ARM_APPENDED_DTB_IMAGE_NAMES))
+ifneq ($(DTB_NAMES),)
+DTB_LIST := $(addsuffix .dtb,$(DTB_NAMES))
+else
+DTB_LIST := $(dtb-y)
+endif
+DTB_OBJS := $(addprefix $(obj)/dts/,$(DTB_LIST))
+
ifeq ($(CONFIG_XIP_KERNEL),y)
$(obj)/xipImage: vmlinux FORCE
@@ -55,6 +64,10 @@
$(call if_changed,objcopy)
@$(kecho) ' Kernel: $@ is ready'
+$(obj)/zImage-dtb: $(obj)/zImage $(DTB_OBJS) FORCE
+ $(call if_changed,cat)
+ @echo ' Kernel: $@ is ready'
+
endif
ifneq ($(LOADADDR),)
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 8569137..d2e43b0 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -778,6 +778,8 @@
bic r6, r6, #1 << 31 @ 32-bit translation system
bic r6, r6, #(7 << 0) | (1 << 4) @ use only ttbr0
mcrne p15, 0, r3, c2, c0, 0 @ load page table pointer
+ mcrne p15, 0, r0, c8, c7, 0 @ flush I,D TLBs
+ mcr p15, 0, r0, c7, c5, 4 @ ISB
mcrne p15, 0, r1, c3, c0, 0 @ load domain access control
mcrne p15, 0, r6, c2, c0, 2 @ load ttb control
#endif
diff --git a/arch/arm/boot/compressed/string.c b/arch/arm/boot/compressed/string.c
index 36e53ef..6894674 100644
--- a/arch/arm/boot/compressed/string.c
+++ b/arch/arm/boot/compressed/string.c
@@ -65,6 +65,15 @@
return sc - s;
}
+size_t strnlen(const char *s, size_t count)
+{
+ const char *sc;
+
+ for (sc = s; count-- && *sc != '\0'; ++sc)
+ /* nothing */;
+ return sc - s;
+}
+
int memcmp(const void *cs, const void *ct, size_t count)
{
const unsigned char *su1 = cs, *su2 = ct, *end = su1 + count;
diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 30bbc37..97d1b37 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -782,5 +782,15 @@
dtstree := $(srctree)/$(src)
dtb-$(CONFIG_OF_ALL_DTBS) := $(patsubst $(dtstree)/%.dts,%.dtb, $(wildcard $(dtstree)/*.dts))
-always := $(dtb-y)
+DTB_NAMES := $(subst $\",,$(CONFIG_BUILD_ARM_APPENDED_DTB_IMAGE_NAMES))
+ifneq ($(DTB_NAMES),)
+DTB_LIST := $(addsuffix .dtb,$(DTB_NAMES))
+else
+DTB_LIST := $(dtb-y)
+endif
+
+targets += dtbs dtbs_install
+targets += $(DTB_LIST)
+
+always := $(DTB_LIST)
clean-files := *.dtb
diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
index 9353184..ce01364 100644
--- a/arch/arm/common/Kconfig
+++ b/arch/arm/common/Kconfig
@@ -17,3 +17,7 @@
config SHARP_SCOOP
bool
+
+config FIQ_GLUE
+ bool
+ select FIQ
diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index 27f23b1..04aca89 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -4,6 +4,7 @@
obj-y += firmware.o
+obj-$(CONFIG_FIQ_GLUE) += fiq_glue.o fiq_glue_setup.o
obj-$(CONFIG_ICST) += icst.o
obj-$(CONFIG_SA1111) += sa1111.o
obj-$(CONFIG_DMABOUNCE) += dmabounce.o
diff --git a/arch/arm/common/fiq_glue.S b/arch/arm/common/fiq_glue.S
new file mode 100644
index 0000000..24b42ce
--- /dev/null
+++ b/arch/arm/common/fiq_glue.S
@@ -0,0 +1,118 @@
+/*
+ * Copyright (C) 2008 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+ .text
+
+ .global fiq_glue_end
+
+ /* fiq stack: r0-r15,cpsr,spsr of interrupted mode */
+
+ENTRY(fiq_glue)
+ /* store pc, cpsr from previous mode, reserve space for spsr */
+ mrs r12, spsr
+ sub lr, lr, #4
+ subs r10, #1
+ bne nested_fiq
+
+ str r12, [sp, #-8]!
+ str lr, [sp, #-4]!
+
+ /* store r8-r14 from previous mode */
+ sub sp, sp, #(7 * 4)
+ stmia sp, {r8-r14}^
+ nop
+
+ /* store r0-r7 from previous mode */
+ stmfd sp!, {r0-r7}
+
+ /* setup func(data,regs) arguments */
+ mov r0, r9
+ mov r1, sp
+ mov r3, r8
+
+ mov r7, sp
+
+ /* Get sp and lr from non-user modes */
+ and r4, r12, #MODE_MASK
+ cmp r4, #USR_MODE
+ beq fiq_from_usr_mode
+
+ mov r7, sp
+ orr r4, r4, #(PSR_I_BIT | PSR_F_BIT)
+ msr cpsr_c, r4
+ str sp, [r7, #(4 * 13)]
+ str lr, [r7, #(4 * 14)]
+ mrs r5, spsr
+ str r5, [r7, #(4 * 17)]
+
+ cmp r4, #(SVC_MODE | PSR_I_BIT | PSR_F_BIT)
+ /* use fiq stack if we reenter this mode */
+ subne sp, r7, #(4 * 3)
+
+fiq_from_usr_mode:
+ msr cpsr_c, #(SVC_MODE | PSR_I_BIT | PSR_F_BIT)
+ mov r2, sp
+ sub sp, r7, #12
+ stmfd sp!, {r2, ip, lr}
+ /* call func(data,regs) */
+ blx r3
+ ldmfd sp, {r2, ip, lr}
+ mov sp, r2
+
+ /* restore/discard saved state */
+ cmp r4, #USR_MODE
+ beq fiq_from_usr_mode_exit
+
+ msr cpsr_c, r4
+ ldr sp, [r7, #(4 * 13)]
+ ldr lr, [r7, #(4 * 14)]
+ msr spsr_cxsf, r5
+
+fiq_from_usr_mode_exit:
+ msr cpsr_c, #(FIQ_MODE | PSR_I_BIT | PSR_F_BIT)
+
+ ldmfd sp!, {r0-r7}
+ ldr lr, [sp, #(4 * 7)]
+ ldr r12, [sp, #(4 * 8)]
+ add sp, sp, #(10 * 4)
+exit_fiq:
+ msr spsr_cxsf, r12
+ add r10, #1
+ cmp r11, #0
+ moveqs pc, lr
+ bx r11 /* jump to custom fiq return function */
+
+nested_fiq:
+ orr r12, r12, #(PSR_F_BIT)
+ b exit_fiq
+
+fiq_glue_end:
+
+ENTRY(fiq_glue_setup) /* func, data, sp, smc call number */
+ stmfd sp!, {r4}
+ mrs r4, cpsr
+ msr cpsr_c, #(FIQ_MODE | PSR_I_BIT | PSR_F_BIT)
+ movs r8, r0
+ mov r9, r1
+ mov sp, r2
+ mov r11, r3
+ moveq r10, #0
+ movne r10, #1
+ msr cpsr_c, r4
+ ldmfd sp!, {r4}
+ bx lr
+
diff --git a/arch/arm/common/fiq_glue_setup.c b/arch/arm/common/fiq_glue_setup.c
new file mode 100644
index 0000000..8cb1b61
--- /dev/null
+++ b/arch/arm/common/fiq_glue_setup.c
@@ -0,0 +1,147 @@
+/*
+ * Copyright (C) 2010 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/percpu.h>
+#include <linux/slab.h>
+#include <asm/fiq.h>
+#include <asm/fiq_glue.h>
+
+extern unsigned char fiq_glue, fiq_glue_end;
+extern void fiq_glue_setup(void *func, void *data, void *sp,
+ fiq_return_handler_t fiq_return_handler);
+
+static struct fiq_handler fiq_debbuger_fiq_handler = {
+ .name = "fiq_glue",
+};
+DEFINE_PER_CPU(void *, fiq_stack);
+static struct fiq_glue_handler *current_handler;
+static fiq_return_handler_t fiq_return_handler;
+static DEFINE_MUTEX(fiq_glue_lock);
+
+static void fiq_glue_setup_helper(void *info)
+{
+ struct fiq_glue_handler *handler = info;
+ fiq_glue_setup(handler->fiq, handler,
+ __get_cpu_var(fiq_stack) + THREAD_START_SP,
+ fiq_return_handler);
+}
+
+int fiq_glue_register_handler(struct fiq_glue_handler *handler)
+{
+ int ret;
+ int cpu;
+
+ if (!handler || !handler->fiq)
+ return -EINVAL;
+
+ mutex_lock(&fiq_glue_lock);
+ if (fiq_stack) {
+ ret = -EBUSY;
+ goto err_busy;
+ }
+
+ for_each_possible_cpu(cpu) {
+ void *stack;
+ stack = (void *)__get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER);
+ if (WARN_ON(!stack)) {
+ ret = -ENOMEM;
+ goto err_alloc_fiq_stack;
+ }
+ per_cpu(fiq_stack, cpu) = stack;
+ }
+
+ ret = claim_fiq(&fiq_debbuger_fiq_handler);
+ if (WARN_ON(ret))
+ goto err_claim_fiq;
+
+ current_handler = handler;
+ on_each_cpu(fiq_glue_setup_helper, handler, true);
+ set_fiq_handler(&fiq_glue, &fiq_glue_end - &fiq_glue);
+
+ mutex_unlock(&fiq_glue_lock);
+ return 0;
+
+err_claim_fiq:
+err_alloc_fiq_stack:
+ for_each_possible_cpu(cpu) {
+ __free_pages(per_cpu(fiq_stack, cpu), THREAD_SIZE_ORDER);
+ per_cpu(fiq_stack, cpu) = NULL;
+ }
+err_busy:
+ mutex_unlock(&fiq_glue_lock);
+ return ret;
+}
+
+static void fiq_glue_update_return_handler(void (*fiq_return)(void))
+{
+ fiq_return_handler = fiq_return;
+ if (current_handler)
+ on_each_cpu(fiq_glue_setup_helper, current_handler, true);
+}
+
+int fiq_glue_set_return_handler(void (*fiq_return)(void))
+{
+ int ret;
+
+ mutex_lock(&fiq_glue_lock);
+ if (fiq_return_handler) {
+ ret = -EBUSY;
+ goto err_busy;
+ }
+ fiq_glue_update_return_handler(fiq_return);
+ ret = 0;
+err_busy:
+ mutex_unlock(&fiq_glue_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(fiq_glue_set_return_handler);
+
+int fiq_glue_clear_return_handler(void (*fiq_return)(void))
+{
+ int ret;
+
+ mutex_lock(&fiq_glue_lock);
+ if (WARN_ON(fiq_return_handler != fiq_return)) {
+ ret = -EINVAL;
+ goto err_inval;
+ }
+ fiq_glue_update_return_handler(NULL);
+ ret = 0;
+err_inval:
+ mutex_unlock(&fiq_glue_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(fiq_glue_clear_return_handler);
+
+/**
+ * fiq_glue_resume - Restore fiqs after suspend or low power idle states
+ *
+ * This must be called before calling local_fiq_enable after returning from a
+ * power state where the fiq mode registers were lost. If a driver provided
+ * a resume hook when it registered the handler it will be called.
+ */
+
+void fiq_glue_resume(void)
+{
+ if (!current_handler)
+ return;
+ fiq_glue_setup(current_handler->fiq, current_handler,
+ __get_cpu_var(fiq_stack) + THREAD_START_SP,
+ fiq_return_handler);
+ if (current_handler->resume)
+ current_handler->resume(current_handler);
+}
+
diff --git a/arch/arm/configs/ranchu_defconfig b/arch/arm/configs/ranchu_defconfig
new file mode 100644
index 0000000..d5a16df
--- /dev/null
+++ b/arch/arm/configs/ranchu_defconfig
@@ -0,0 +1,317 @@
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_AUDIT=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_LOG_BUF_SHIFT=14
+CONFIG_CGROUPS=y
+CONFIG_CGROUP_DEBUG=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CPUSETS=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_KALLSYMS_ALL=y
+CONFIG_EMBEDDED=y
+CONFIG_PROFILING=y
+CONFIG_OPROFILE=y
+CONFIG_ARCH_MMAP_RND_BITS=16
+# CONFIG_BLK_DEV_BSG is not set
+# CONFIG_IOSCHED_DEADLINE is not set
+# CONFIG_IOSCHED_CFQ is not set
+CONFIG_ARCH_VIRT=y
+CONFIG_ARM_KERNMEM_PERMS=y
+CONFIG_SMP=y
+CONFIG_PREEMPT=y
+CONFIG_AEABI=y
+CONFIG_HIGHMEM=y
+CONFIG_KSM=y
+CONFIG_SECCOMP=y
+CONFIG_CMDLINE="console=ttyAMA0"
+CONFIG_VFP=y
+CONFIG_NEON=y
+# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
+CONFIG_PM_AUTOSLEEP=y
+CONFIG_PM_WAKELOCKS=y
+CONFIG_PM_WAKELOCKS_LIMIT=0
+# CONFIG_PM_WAKELOCKS_GC is not set
+CONFIG_PM_DEBUG=y
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_XFRM_USER=y
+CONFIG_NET_KEY=y
+CONFIG_INET=y
+CONFIG_INET_DIAG_DESTROY=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+CONFIG_INET_ESP=y
+# CONFIG_INET_LRO is not set
+CONFIG_IPV6_ROUTER_PREF=y
+CONFIG_IPV6_ROUTE_INFO=y
+CONFIG_IPV6_OPTIMISTIC_DAD=y
+CONFIG_INET6_AH=y
+CONFIG_INET6_ESP=y
+CONFIG_INET6_IPCOMP=y
+CONFIG_IPV6_MIP6=y
+CONFIG_IPV6_MULTIPLE_TABLES=y
+CONFIG_NETFILTER=y
+CONFIG_NF_CONNTRACK=y
+CONFIG_NF_CONNTRACK_SECMARK=y
+CONFIG_NF_CONNTRACK_EVENTS=y
+CONFIG_NF_CT_PROTO_DCCP=y
+CONFIG_NF_CT_PROTO_SCTP=y
+CONFIG_NF_CT_PROTO_UDPLITE=y
+CONFIG_NF_CONNTRACK_AMANDA=y
+CONFIG_NF_CONNTRACK_FTP=y
+CONFIG_NF_CONNTRACK_H323=y
+CONFIG_NF_CONNTRACK_IRC=y
+CONFIG_NF_CONNTRACK_NETBIOS_NS=y
+CONFIG_NF_CONNTRACK_PPTP=y
+CONFIG_NF_CONNTRACK_SANE=y
+CONFIG_NF_CONNTRACK_TFTP=y
+CONFIG_NF_CT_NETLINK=y
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=y
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=y
+CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=y
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=y
+CONFIG_NETFILTER_XT_TARGET_MARK=y
+CONFIG_NETFILTER_XT_TARGET_NFLOG=y
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
+CONFIG_NETFILTER_XT_TARGET_TPROXY=y
+CONFIG_NETFILTER_XT_TARGET_TRACE=y
+CONFIG_NETFILTER_XT_TARGET_SECMARK=y
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=y
+CONFIG_NETFILTER_XT_MATCH_COMMENT=y
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=y
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=y
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=y
+CONFIG_NETFILTER_XT_MATCH_HELPER=y
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=y
+CONFIG_NETFILTER_XT_MATCH_LENGTH=y
+CONFIG_NETFILTER_XT_MATCH_LIMIT=y
+CONFIG_NETFILTER_XT_MATCH_MAC=y
+CONFIG_NETFILTER_XT_MATCH_MARK=y
+CONFIG_NETFILTER_XT_MATCH_POLICY=y
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=y
+CONFIG_NETFILTER_XT_MATCH_QTAGUID=y
+CONFIG_NETFILTER_XT_MATCH_QUOTA=y
+CONFIG_NETFILTER_XT_MATCH_QUOTA2=y
+CONFIG_NETFILTER_XT_MATCH_SOCKET=y
+CONFIG_NETFILTER_XT_MATCH_STATE=y
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
+CONFIG_NETFILTER_XT_MATCH_STRING=y
+CONFIG_NETFILTER_XT_MATCH_TIME=y
+CONFIG_NETFILTER_XT_MATCH_U32=y
+CONFIG_NF_CONNTRACK_IPV4=y
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_NF_MATCH_AH=y
+CONFIG_IP_NF_MATCH_ECN=y
+CONFIG_IP_NF_MATCH_TTL=y
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_TARGET_REJECT=y
+CONFIG_IP_NF_MANGLE=y
+CONFIG_IP_NF_RAW=y
+CONFIG_IP_NF_SECURITY=y
+CONFIG_IP_NF_ARPTABLES=y
+CONFIG_IP_NF_ARPFILTER=y
+CONFIG_IP_NF_ARP_MANGLE=y
+CONFIG_NF_CONNTRACK_IPV6=y
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=y
+CONFIG_IP6_NF_MANGLE=y
+CONFIG_IP6_NF_RAW=y
+CONFIG_BRIDGE=y
+CONFIG_NET_SCHED=y
+CONFIG_NET_SCH_HTB=y
+CONFIG_NET_CLS_U32=y
+CONFIG_NET_EMATCH=y
+CONFIG_NET_EMATCH_U32=y
+CONFIG_NET_CLS_ACT=y
+# CONFIG_WIRELESS is not set
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_MTD=y
+CONFIG_MTD_CMDLINE_PARTS=y
+CONFIG_MTD_BLOCK=y
+CONFIG_MTD_CFI=y
+CONFIG_MTD_CFI_INTELEXT=y
+CONFIG_MTD_CFI_AMDSTD=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BLK_DEV_RAM_SIZE=8192
+CONFIG_VIRTIO_BLK=y
+CONFIG_MD=y
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_CRYPT=y
+CONFIG_DM_UEVENT=y
+CONFIG_DM_VERITY=y
+CONFIG_DM_VERITY_FEC=y
+CONFIG_NETDEVICES=y
+CONFIG_TUN=y
+CONFIG_VIRTIO_NET=y
+CONFIG_SMSC911X=y
+CONFIG_PPP=y
+CONFIG_PPP_BSDCOMP=y
+CONFIG_PPP_DEFLATE=y
+CONFIG_PPP_MPPE=y
+CONFIG_PPPOLAC=y
+CONFIG_PPPOPNS=y
+CONFIG_USB_USBNET=y
+# CONFIG_WLAN is not set
+CONFIG_INPUT_EVDEV=y
+CONFIG_INPUT_KEYRESET=y
+CONFIG_KEYBOARD_GOLDFISH_EVENTS=y
+CONFIG_KEYBOARD_GOLDFISH_ROTARY=y
+# CONFIG_INPUT_MOUSE is not set
+CONFIG_INPUT_JOYSTICK=y
+CONFIG_JOYSTICK_XPAD=y
+CONFIG_JOYSTICK_XPAD_FF=y
+CONFIG_JOYSTICK_XPAD_LEDS=y
+CONFIG_INPUT_TABLET=y
+CONFIG_TABLET_USB_ACECAD=y
+CONFIG_TABLET_USB_AIPTEK=y
+CONFIG_TABLET_USB_GTCO=y
+CONFIG_TABLET_USB_HANWANG=y
+CONFIG_TABLET_USB_KBTAB=y
+CONFIG_INPUT_MISC=y
+CONFIG_INPUT_KEYCHORD=y
+CONFIG_INPUT_UINPUT=y
+CONFIG_INPUT_GPIO=y
+# CONFIG_SERIO_SERPORT is not set
+CONFIG_SERIO_AMBAKMI=y
+# CONFIG_VT is not set
+# CONFIG_LEGACY_PTYS is not set
+# CONFIG_DEVMEM is not set
+# CONFIG_DEVKMEM is not set
+CONFIG_SERIAL_AMBA_PL011=y
+CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+CONFIG_VIRTIO_CONSOLE=y
+# CONFIG_HW_RANDOM is not set
+# CONFIG_HWMON is not set
+CONFIG_MEDIA_SUPPORT=y
+CONFIG_FB=y
+CONFIG_FB_GOLDFISH=y
+CONFIG_FB_SIMPLE=y
+CONFIG_BACKLIGHT_LCD_SUPPORT=y
+CONFIG_LOGO=y
+# CONFIG_LOGO_LINUX_MONO is not set
+# CONFIG_LOGO_LINUX_VGA16 is not set
+CONFIG_SOUND=y
+CONFIG_SND=y
+CONFIG_HIDRAW=y
+CONFIG_UHID=y
+CONFIG_HID_A4TECH=y
+CONFIG_HID_ACRUX=y
+CONFIG_HID_ACRUX_FF=y
+CONFIG_HID_APPLE=y
+CONFIG_HID_BELKIN=y
+CONFIG_HID_CHERRY=y
+CONFIG_HID_CHICONY=y
+CONFIG_HID_PRODIKEYS=y
+CONFIG_HID_CYPRESS=y
+CONFIG_HID_DRAGONRISE=y
+CONFIG_DRAGONRISE_FF=y
+CONFIG_HID_EMS_FF=y
+CONFIG_HID_ELECOM=y
+CONFIG_HID_EZKEY=y
+CONFIG_HID_HOLTEK=y
+CONFIG_HID_KEYTOUCH=y
+CONFIG_HID_KYE=y
+CONFIG_HID_UCLOGIC=y
+CONFIG_HID_WALTOP=y
+CONFIG_HID_GYRATION=y
+CONFIG_HID_TWINHAN=y
+CONFIG_HID_KENSINGTON=y
+CONFIG_HID_LCPOWER=y
+CONFIG_HID_LOGITECH=y
+CONFIG_HID_LOGITECH_DJ=y
+CONFIG_LOGITECH_FF=y
+CONFIG_LOGIRUMBLEPAD2_FF=y
+CONFIG_LOGIG940_FF=y
+CONFIG_HID_MAGICMOUSE=y
+CONFIG_HID_MICROSOFT=y
+CONFIG_HID_MONTEREY=y
+CONFIG_HID_MULTITOUCH=y
+CONFIG_HID_NTRIG=y
+CONFIG_HID_ORTEK=y
+CONFIG_HID_PANTHERLORD=y
+CONFIG_PANTHERLORD_FF=y
+CONFIG_HID_PETALYNX=y
+CONFIG_HID_PICOLCD=y
+CONFIG_HID_PRIMAX=y
+CONFIG_HID_ROCCAT=y
+CONFIG_HID_SAITEK=y
+CONFIG_HID_SAMSUNG=y
+CONFIG_HID_SONY=y
+CONFIG_HID_SPEEDLINK=y
+CONFIG_HID_SUNPLUS=y
+CONFIG_HID_GREENASIA=y
+CONFIG_GREENASIA_FF=y
+CONFIG_HID_SMARTJOYPLUS=y
+CONFIG_SMARTJOYPLUS_FF=y
+CONFIG_HID_TIVO=y
+CONFIG_HID_TOPSEED=y
+CONFIG_HID_THRUSTMASTER=y
+CONFIG_HID_WACOM=y
+CONFIG_HID_WIIMOTE=y
+CONFIG_HID_ZEROPLUS=y
+CONFIG_HID_ZYDACRON=y
+CONFIG_USB_HIDDEV=y
+CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
+CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_OTG_WAKELOCK=y
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_DRV_PL031=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_STAGING=y
+CONFIG_ASHMEM=y
+CONFIG_ANDROID_LOW_MEMORY_KILLER=y
+CONFIG_SYNC=y
+CONFIG_SW_SYNC=y
+CONFIG_SW_SYNC_USER=y
+CONFIG_ION=y
+CONFIG_GOLDFISH_AUDIO=y
+CONFIG_GOLDFISH=y
+CONFIG_GOLDFISH_PIPE=y
+CONFIG_ANDROID=y
+CONFIG_ANDROID_BINDER_IPC=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_SECURITY=y
+CONFIG_QUOTA=y
+CONFIG_FUSE_FS=y
+CONFIG_CUSE=y
+CONFIG_MSDOS_FS=y
+CONFIG_VFAT_FS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_PSTORE=y
+CONFIG_PSTORE_CONSOLE=y
+CONFIG_PSTORE_RAM=y
+CONFIG_NFS_FS=y
+CONFIG_ROOT_NFS=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_DEBUG_INFO=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_PANIC_TIMEOUT=5
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_SCHEDSTATS=y
+CONFIG_TIMER_STATS=y
+CONFIG_ENABLE_DEFAULT_TRACERS=y
+CONFIG_SECURITY=y
+CONFIG_SECURITY_NETWORK=y
+CONFIG_SECURITY_SELINUX=y
+CONFIG_VIRTUALIZATION=y
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 27ed1b1..72aa5ec 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -120,4 +120,11 @@
that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64)
that is part of the ARMv8 Crypto Extensions
+config CRYPTO_SPECK_NEON
+ tristate "NEON accelerated Speck cipher algorithms"
+ depends on KERNEL_MODE_NEON
+ select CRYPTO_BLKCIPHER
+ select CRYPTO_GF128MUL
+ select CRYPTO_SPECK
+
endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index fc51507..1d0448a 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -8,6 +8,7 @@
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
+obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -36,6 +37,7 @@
sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
+speck-neon-y := speck-neon-core.o speck-neon-glue.o
quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index 679c589..1f7b98e 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -369,7 +369,7 @@
.cra_blkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
- .ivsize = AES_BLOCK_SIZE,
+ .ivsize = 0,
.setkey = ce_aes_setkey,
.encrypt = ecb_encrypt,
.decrypt = ecb_decrypt,
@@ -446,7 +446,7 @@
.cra_ablkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
- .ivsize = AES_BLOCK_SIZE,
+ .ivsize = 0,
.setkey = ablk_set_key,
.encrypt = ablk_encrypt,
.decrypt = ablk_decrypt,
diff --git a/arch/arm/crypto/speck-neon-core.S b/arch/arm/crypto/speck-neon-core.S
new file mode 100644
index 0000000..3c1e203
--- /dev/null
+++ b/arch/arm/crypto/speck-neon-core.S
@@ -0,0 +1,432 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
+ *
+ * Copyright (c) 2018 Google, Inc
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+ .text
+ .fpu neon
+
+ // arguments
+ ROUND_KEYS .req r0 // const {u64,u32} *round_keys
+ NROUNDS .req r1 // int nrounds
+ DST .req r2 // void *dst
+ SRC .req r3 // const void *src
+ NBYTES .req r4 // unsigned int nbytes
+ TWEAK .req r5 // void *tweak
+
+ // registers which hold the data being encrypted/decrypted
+ X0 .req q0
+ X0_L .req d0
+ X0_H .req d1
+ Y0 .req q1
+ Y0_H .req d3
+ X1 .req q2
+ X1_L .req d4
+ X1_H .req d5
+ Y1 .req q3
+ Y1_H .req d7
+ X2 .req q4
+ X2_L .req d8
+ X2_H .req d9
+ Y2 .req q5
+ Y2_H .req d11
+ X3 .req q6
+ X3_L .req d12
+ X3_H .req d13
+ Y3 .req q7
+ Y3_H .req d15
+
+ // the round key, duplicated in all lanes
+ ROUND_KEY .req q8
+ ROUND_KEY_L .req d16
+ ROUND_KEY_H .req d17
+
+ // index vector for vtbl-based 8-bit rotates
+ ROTATE_TABLE .req d18
+
+ // multiplication table for updating XTS tweaks
+ GF128MUL_TABLE .req d19
+ GF64MUL_TABLE .req d19
+
+ // current XTS tweak value(s)
+ TWEAKV .req q10
+ TWEAKV_L .req d20
+ TWEAKV_H .req d21
+
+ TMP0 .req q12
+ TMP0_L .req d24
+ TMP0_H .req d25
+ TMP1 .req q13
+ TMP2 .req q14
+ TMP3 .req q15
+
+ .align 4
+.Lror64_8_table:
+ .byte 1, 2, 3, 4, 5, 6, 7, 0
+.Lror32_8_table:
+ .byte 1, 2, 3, 0, 5, 6, 7, 4
+.Lrol64_8_table:
+ .byte 7, 0, 1, 2, 3, 4, 5, 6
+.Lrol32_8_table:
+ .byte 3, 0, 1, 2, 7, 4, 5, 6
+.Lgf128mul_table:
+ .byte 0, 0x87
+ .fill 14
+.Lgf64mul_table:
+ .byte 0, 0x1b, (0x1b << 1), (0x1b << 1) ^ 0x1b
+ .fill 12
+
+/*
+ * _speck_round_128bytes() - Speck encryption round on 128 bytes at a time
+ *
+ * Do one Speck encryption round on the 128 bytes (8 blocks for Speck128, 16 for
+ * Speck64) stored in X0-X3 and Y0-Y3, using the round key stored in all lanes
+ * of ROUND_KEY. 'n' is the lane size: 64 for Speck128, or 32 for Speck64.
+ *
+ * The 8-bit rotates are implemented using vtbl instead of vshr + vsli because
+ * the vtbl approach is faster on some processors and the same speed on others.
+ */
+.macro _speck_round_128bytes n
+
+ // x = ror(x, 8)
+ vtbl.8 X0_L, {X0_L}, ROTATE_TABLE
+ vtbl.8 X0_H, {X0_H}, ROTATE_TABLE
+ vtbl.8 X1_L, {X1_L}, ROTATE_TABLE
+ vtbl.8 X1_H, {X1_H}, ROTATE_TABLE
+ vtbl.8 X2_L, {X2_L}, ROTATE_TABLE
+ vtbl.8 X2_H, {X2_H}, ROTATE_TABLE
+ vtbl.8 X3_L, {X3_L}, ROTATE_TABLE
+ vtbl.8 X3_H, {X3_H}, ROTATE_TABLE
+
+ // x += y
+ vadd.u\n X0, Y0
+ vadd.u\n X1, Y1
+ vadd.u\n X2, Y2
+ vadd.u\n X3, Y3
+
+ // x ^= k
+ veor X0, ROUND_KEY
+ veor X1, ROUND_KEY
+ veor X2, ROUND_KEY
+ veor X3, ROUND_KEY
+
+ // y = rol(y, 3)
+ vshl.u\n TMP0, Y0, #3
+ vshl.u\n TMP1, Y1, #3
+ vshl.u\n TMP2, Y2, #3
+ vshl.u\n TMP3, Y3, #3
+ vsri.u\n TMP0, Y0, #(\n - 3)
+ vsri.u\n TMP1, Y1, #(\n - 3)
+ vsri.u\n TMP2, Y2, #(\n - 3)
+ vsri.u\n TMP3, Y3, #(\n - 3)
+
+ // y ^= x
+ veor Y0, TMP0, X0
+ veor Y1, TMP1, X1
+ veor Y2, TMP2, X2
+ veor Y3, TMP3, X3
+.endm
+
+/*
+ * _speck_unround_128bytes() - Speck decryption round on 128 bytes at a time
+ *
+ * This is the inverse of _speck_round_128bytes().
+ */
+.macro _speck_unround_128bytes n
+
+ // y ^= x
+ veor TMP0, Y0, X0
+ veor TMP1, Y1, X1
+ veor TMP2, Y2, X2
+ veor TMP3, Y3, X3
+
+ // y = ror(y, 3)
+ vshr.u\n Y0, TMP0, #3
+ vshr.u\n Y1, TMP1, #3
+ vshr.u\n Y2, TMP2, #3
+ vshr.u\n Y3, TMP3, #3
+ vsli.u\n Y0, TMP0, #(\n - 3)
+ vsli.u\n Y1, TMP1, #(\n - 3)
+ vsli.u\n Y2, TMP2, #(\n - 3)
+ vsli.u\n Y3, TMP3, #(\n - 3)
+
+ // x ^= k
+ veor X0, ROUND_KEY
+ veor X1, ROUND_KEY
+ veor X2, ROUND_KEY
+ veor X3, ROUND_KEY
+
+ // x -= y
+ vsub.u\n X0, Y0
+ vsub.u\n X1, Y1
+ vsub.u\n X2, Y2
+ vsub.u\n X3, Y3
+
+ // x = rol(x, 8);
+ vtbl.8 X0_L, {X0_L}, ROTATE_TABLE
+ vtbl.8 X0_H, {X0_H}, ROTATE_TABLE
+ vtbl.8 X1_L, {X1_L}, ROTATE_TABLE
+ vtbl.8 X1_H, {X1_H}, ROTATE_TABLE
+ vtbl.8 X2_L, {X2_L}, ROTATE_TABLE
+ vtbl.8 X2_H, {X2_H}, ROTATE_TABLE
+ vtbl.8 X3_L, {X3_L}, ROTATE_TABLE
+ vtbl.8 X3_H, {X3_H}, ROTATE_TABLE
+.endm
+
+.macro _xts128_precrypt_one dst_reg, tweak_buf, tmp
+
+ // Load the next source block
+ vld1.8 {\dst_reg}, [SRC]!
+
+ // Save the current tweak in the tweak buffer
+ vst1.8 {TWEAKV}, [\tweak_buf:128]!
+
+ // XOR the next source block with the current tweak
+ veor \dst_reg, TWEAKV
+
+ /*
+ * Calculate the next tweak by multiplying the current one by x,
+ * modulo p(x) = x^128 + x^7 + x^2 + x + 1.
+ */
+ vshr.u64 \tmp, TWEAKV, #63
+ vshl.u64 TWEAKV, #1
+ veor TWEAKV_H, \tmp\()_L
+ vtbl.8 \tmp\()_H, {GF128MUL_TABLE}, \tmp\()_H
+ veor TWEAKV_L, \tmp\()_H
+.endm
+
+.macro _xts64_precrypt_two dst_reg, tweak_buf, tmp
+
+ // Load the next two source blocks
+ vld1.8 {\dst_reg}, [SRC]!
+
+ // Save the current two tweaks in the tweak buffer
+ vst1.8 {TWEAKV}, [\tweak_buf:128]!
+
+ // XOR the next two source blocks with the current two tweaks
+ veor \dst_reg, TWEAKV
+
+ /*
+ * Calculate the next two tweaks by multiplying the current ones by x^2,
+ * modulo p(x) = x^64 + x^4 + x^3 + x + 1.
+ */
+ vshr.u64 \tmp, TWEAKV, #62
+ vshl.u64 TWEAKV, #2
+ vtbl.8 \tmp\()_L, {GF64MUL_TABLE}, \tmp\()_L
+ vtbl.8 \tmp\()_H, {GF64MUL_TABLE}, \tmp\()_H
+ veor TWEAKV, \tmp
+.endm
+
+/*
+ * _speck_xts_crypt() - Speck-XTS encryption/decryption
+ *
+ * Encrypt or decrypt NBYTES bytes of data from the SRC buffer to the DST buffer
+ * using Speck-XTS, specifically the variant with a block size of '2n' and round
+ * count given by NROUNDS. The expanded round keys are given in ROUND_KEYS, and
+ * the current XTS tweak value is given in TWEAK. It's assumed that NBYTES is a
+ * nonzero multiple of 128.
+ */
+.macro _speck_xts_crypt n, decrypting
+ push {r4-r7}
+ mov r7, sp
+
+ /*
+ * The first four parameters were passed in registers r0-r3. Load the
+ * additional parameters, which were passed on the stack.
+ */
+ ldr NBYTES, [sp, #16]
+ ldr TWEAK, [sp, #20]
+
+ /*
+ * If decrypting, modify the ROUND_KEYS parameter to point to the last
+ * round key rather than the first, since for decryption the round keys
+ * are used in reverse order.
+ */
+.if \decrypting
+.if \n == 64
+ add ROUND_KEYS, ROUND_KEYS, NROUNDS, lsl #3
+ sub ROUND_KEYS, #8
+.else
+ add ROUND_KEYS, ROUND_KEYS, NROUNDS, lsl #2
+ sub ROUND_KEYS, #4
+.endif
+.endif
+
+ // Load the index vector for vtbl-based 8-bit rotates
+.if \decrypting
+ ldr r12, =.Lrol\n\()_8_table
+.else
+ ldr r12, =.Lror\n\()_8_table
+.endif
+ vld1.8 {ROTATE_TABLE}, [r12:64]
+
+ // One-time XTS preparation
+
+ /*
+ * Allocate stack space to store 128 bytes worth of tweaks. For
+ * performance, this space is aligned to a 16-byte boundary so that we
+ * can use the load/store instructions that declare 16-byte alignment.
+ */
+ sub sp, #128
+ bic sp, #0xf
+
+.if \n == 64
+ // Load first tweak
+ vld1.8 {TWEAKV}, [TWEAK]
+
+ // Load GF(2^128) multiplication table
+ ldr r12, =.Lgf128mul_table
+ vld1.8 {GF128MUL_TABLE}, [r12:64]
+.else
+ // Load first tweak
+ vld1.8 {TWEAKV_L}, [TWEAK]
+
+ // Load GF(2^64) multiplication table
+ ldr r12, =.Lgf64mul_table
+ vld1.8 {GF64MUL_TABLE}, [r12:64]
+
+ // Calculate second tweak, packing it together with the first
+ vshr.u64 TMP0_L, TWEAKV_L, #63
+ vtbl.u8 TMP0_L, {GF64MUL_TABLE}, TMP0_L
+ vshl.u64 TWEAKV_H, TWEAKV_L, #1
+ veor TWEAKV_H, TMP0_L
+.endif
+
+.Lnext_128bytes_\@:
+
+ /*
+ * Load the source blocks into {X,Y}[0-3], XOR them with their XTS tweak
+ * values, and save the tweaks on the stack for later. Then
+ * de-interleave the 'x' and 'y' elements of each block, i.e. make it so
+ * that the X[0-3] registers contain only the second halves of blocks,
+ * and the Y[0-3] registers contain only the first halves of blocks.
+ * (Speck uses the order (y, x) rather than the more intuitive (x, y).)
+ */
+ mov r12, sp
+.if \n == 64
+ _xts128_precrypt_one X0, r12, TMP0
+ _xts128_precrypt_one Y0, r12, TMP0
+ _xts128_precrypt_one X1, r12, TMP0
+ _xts128_precrypt_one Y1, r12, TMP0
+ _xts128_precrypt_one X2, r12, TMP0
+ _xts128_precrypt_one Y2, r12, TMP0
+ _xts128_precrypt_one X3, r12, TMP0
+ _xts128_precrypt_one Y3, r12, TMP0
+ vswp X0_L, Y0_H
+ vswp X1_L, Y1_H
+ vswp X2_L, Y2_H
+ vswp X3_L, Y3_H
+.else
+ _xts64_precrypt_two X0, r12, TMP0
+ _xts64_precrypt_two Y0, r12, TMP0
+ _xts64_precrypt_two X1, r12, TMP0
+ _xts64_precrypt_two Y1, r12, TMP0
+ _xts64_precrypt_two X2, r12, TMP0
+ _xts64_precrypt_two Y2, r12, TMP0
+ _xts64_precrypt_two X3, r12, TMP0
+ _xts64_precrypt_two Y3, r12, TMP0
+ vuzp.32 Y0, X0
+ vuzp.32 Y1, X1
+ vuzp.32 Y2, X2
+ vuzp.32 Y3, X3
+.endif
+
+ // Do the cipher rounds
+
+ mov r12, ROUND_KEYS
+ mov r6, NROUNDS
+
+.Lnext_round_\@:
+.if \decrypting
+.if \n == 64
+ vld1.64 ROUND_KEY_L, [r12]
+ sub r12, #8
+ vmov ROUND_KEY_H, ROUND_KEY_L
+.else
+ vld1.32 {ROUND_KEY_L[],ROUND_KEY_H[]}, [r12]
+ sub r12, #4
+.endif
+ _speck_unround_128bytes \n
+.else
+.if \n == 64
+ vld1.64 ROUND_KEY_L, [r12]!
+ vmov ROUND_KEY_H, ROUND_KEY_L
+.else
+ vld1.32 {ROUND_KEY_L[],ROUND_KEY_H[]}, [r12]!
+.endif
+ _speck_round_128bytes \n
+.endif
+ subs r6, r6, #1
+ bne .Lnext_round_\@
+
+ // Re-interleave the 'x' and 'y' elements of each block
+.if \n == 64
+ vswp X0_L, Y0_H
+ vswp X1_L, Y1_H
+ vswp X2_L, Y2_H
+ vswp X3_L, Y3_H
+.else
+ vzip.32 Y0, X0
+ vzip.32 Y1, X1
+ vzip.32 Y2, X2
+ vzip.32 Y3, X3
+.endif
+
+ // XOR the encrypted/decrypted blocks with the tweaks we saved earlier
+ mov r12, sp
+ vld1.8 {TMP0, TMP1}, [r12:128]!
+ vld1.8 {TMP2, TMP3}, [r12:128]!
+ veor X0, TMP0
+ veor Y0, TMP1
+ veor X1, TMP2
+ veor Y1, TMP3
+ vld1.8 {TMP0, TMP1}, [r12:128]!
+ vld1.8 {TMP2, TMP3}, [r12:128]!
+ veor X2, TMP0
+ veor Y2, TMP1
+ veor X3, TMP2
+ veor Y3, TMP3
+
+ // Store the ciphertext in the destination buffer
+ vst1.8 {X0, Y0}, [DST]!
+ vst1.8 {X1, Y1}, [DST]!
+ vst1.8 {X2, Y2}, [DST]!
+ vst1.8 {X3, Y3}, [DST]!
+
+ // Continue if there are more 128-byte chunks remaining, else return
+ subs NBYTES, #128
+ bne .Lnext_128bytes_\@
+
+ // Store the next tweak
+.if \n == 64
+ vst1.8 {TWEAKV}, [TWEAK]
+.else
+ vst1.8 {TWEAKV_L}, [TWEAK]
+.endif
+
+ mov sp, r7
+ pop {r4-r7}
+ bx lr
+.endm
+
+ENTRY(speck128_xts_encrypt_neon)
+ _speck_xts_crypt n=64, decrypting=0
+ENDPROC(speck128_xts_encrypt_neon)
+
+ENTRY(speck128_xts_decrypt_neon)
+ _speck_xts_crypt n=64, decrypting=1
+ENDPROC(speck128_xts_decrypt_neon)
+
+ENTRY(speck64_xts_encrypt_neon)
+ _speck_xts_crypt n=32, decrypting=0
+ENDPROC(speck64_xts_encrypt_neon)
+
+ENTRY(speck64_xts_decrypt_neon)
+ _speck_xts_crypt n=32, decrypting=1
+ENDPROC(speck64_xts_decrypt_neon)
diff --git a/arch/arm/crypto/speck-neon-glue.c b/arch/arm/crypto/speck-neon-glue.c
new file mode 100644
index 0000000..ea36c3a
--- /dev/null
+++ b/arch/arm/crypto/speck-neon-glue.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
+ *
+ * Copyright (c) 2018 Google, Inc
+ *
+ * Note: the NIST recommendation for XTS only specifies a 128-bit block size,
+ * but a 64-bit version (needed for Speck64) is fairly straightforward; the math
+ * is just done in GF(2^64) instead of GF(2^128), with the reducing polynomial
+ * x^64 + x^4 + x^3 + x + 1 from the original XEX paper (Rogaway, 2004:
+ * "Efficient Instantiations of Tweakable Blockciphers and Refinements to Modes
+ * OCB and PMAC"), represented as 0x1B.
+ */
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/algapi.h>
+#include <crypto/gf128mul.h>
+#include <crypto/speck.h>
+#include <crypto/xts.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+/* The assembly functions only handle multiples of 128 bytes */
+#define SPECK_NEON_CHUNK_SIZE 128
+
+/* Speck128 */
+
+struct speck128_xts_tfm_ctx {
+ struct speck128_tfm_ctx main_key;
+ struct speck128_tfm_ctx tweak_key;
+};
+
+asmlinkage void speck128_xts_encrypt_neon(const u64 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+asmlinkage void speck128_xts_decrypt_neon(const u64 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+typedef void (*speck128_crypt_one_t)(const struct speck128_tfm_ctx *,
+ u8 *, const u8 *);
+typedef void (*speck128_xts_crypt_many_t)(const u64 *, int, void *,
+ const void *, unsigned int, void *);
+
+static __always_inline int
+__speck128_xts_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
+ struct scatterlist *src, unsigned int nbytes,
+ speck128_crypt_one_t crypt_one,
+ speck128_xts_crypt_many_t crypt_many)
+{
+ struct crypto_blkcipher *tfm = desc->tfm;
+ const struct speck128_xts_tfm_ctx *ctx = crypto_blkcipher_ctx(tfm);
+ struct blkcipher_walk walk;
+ le128 tweak;
+ int err;
+
+ blkcipher_walk_init(&walk, dst, src, nbytes);
+ err = blkcipher_walk_virt_block(desc, &walk, SPECK_NEON_CHUNK_SIZE);
+
+ crypto_speck128_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);
+
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+ u8 *dst = walk.dst.virt.addr;
+ const u8 *src = walk.src.virt.addr;
+
+ if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
+ unsigned int count;
+
+ count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
+ kernel_neon_begin();
+ (*crypt_many)(ctx->main_key.round_keys,
+ ctx->main_key.nrounds,
+ dst, src, count, &tweak);
+ kernel_neon_end();
+ dst += count;
+ src += count;
+ nbytes -= count;
+ }
+
+ /* Handle any remainder with generic code */
+ while (nbytes >= sizeof(tweak)) {
+ le128_xor((le128 *)dst, (const le128 *)src, &tweak);
+ (*crypt_one)(&ctx->main_key, dst, dst);
+ le128_xor((le128 *)dst, (const le128 *)dst, &tweak);
+ gf128mul_x_ble((be128 *)&tweak, (const be128 *)&tweak);
+
+ dst += sizeof(tweak);
+ src += sizeof(tweak);
+ nbytes -= sizeof(tweak);
+ }
+ err = blkcipher_walk_done(desc, &walk, nbytes);
+ }
+
+ return err;
+}
+
+static int speck128_xts_encrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst,
+ struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck128_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck128_encrypt,
+ speck128_xts_encrypt_neon);
+}
+
+static int speck128_xts_decrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst,
+ struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck128_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck128_decrypt,
+ speck128_xts_decrypt_neon);
+}
+
+static int speck128_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
+ unsigned int keylen)
+{
+ struct speck128_xts_tfm_ctx *ctx = crypto_tfm_ctx(tfm);
+ int err;
+
+ if (keylen % 2)
+ return -EINVAL;
+
+ keylen /= 2;
+
+ err = crypto_speck128_setkey(&ctx->main_key, key, keylen);
+ if (err)
+ return err;
+
+ return crypto_speck128_setkey(&ctx->tweak_key, key + keylen, keylen);
+}
+
+/* Speck64 */
+
+struct speck64_xts_tfm_ctx {
+ struct speck64_tfm_ctx main_key;
+ struct speck64_tfm_ctx tweak_key;
+};
+
+asmlinkage void speck64_xts_encrypt_neon(const u32 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+asmlinkage void speck64_xts_decrypt_neon(const u32 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+typedef void (*speck64_crypt_one_t)(const struct speck64_tfm_ctx *,
+ u8 *, const u8 *);
+typedef void (*speck64_xts_crypt_many_t)(const u32 *, int, void *,
+ const void *, unsigned int, void *);
+
+static __always_inline int
+__speck64_xts_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
+ struct scatterlist *src, unsigned int nbytes,
+ speck64_crypt_one_t crypt_one,
+ speck64_xts_crypt_many_t crypt_many)
+{
+ struct crypto_blkcipher *tfm = desc->tfm;
+ const struct speck64_xts_tfm_ctx *ctx = crypto_blkcipher_ctx(tfm);
+ struct blkcipher_walk walk;
+ __le64 tweak;
+ int err;
+
+ blkcipher_walk_init(&walk, dst, src, nbytes);
+ err = blkcipher_walk_virt_block(desc, &walk, SPECK_NEON_CHUNK_SIZE);
+
+ crypto_speck64_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);
+
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+ u8 *dst = walk.dst.virt.addr;
+ const u8 *src = walk.src.virt.addr;
+
+ if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
+ unsigned int count;
+
+ count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
+ kernel_neon_begin();
+ (*crypt_many)(ctx->main_key.round_keys,
+ ctx->main_key.nrounds,
+ dst, src, count, &tweak);
+ kernel_neon_end();
+ dst += count;
+ src += count;
+ nbytes -= count;
+ }
+
+ /* Handle any remainder with generic code */
+ while (nbytes >= sizeof(tweak)) {
+ *(__le64 *)dst = *(__le64 *)src ^ tweak;
+ (*crypt_one)(&ctx->main_key, dst, dst);
+ *(__le64 *)dst ^= tweak;
+ tweak = cpu_to_le64((le64_to_cpu(tweak) << 1) ^
+ ((tweak & cpu_to_le64(1ULL << 63)) ?
+ 0x1B : 0));
+ dst += sizeof(tweak);
+ src += sizeof(tweak);
+ nbytes -= sizeof(tweak);
+ }
+ err = blkcipher_walk_done(desc, &walk, nbytes);
+ }
+
+ return err;
+}
+
+static int speck64_xts_encrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst, struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck64_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck64_encrypt,
+ speck64_xts_encrypt_neon);
+}
+
+static int speck64_xts_decrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst, struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck64_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck64_decrypt,
+ speck64_xts_decrypt_neon);
+}
+
+static int speck64_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
+ unsigned int keylen)
+{
+ struct speck64_xts_tfm_ctx *ctx = crypto_tfm_ctx(tfm);
+ int err;
+
+ if (keylen % 2)
+ return -EINVAL;
+
+ keylen /= 2;
+
+ err = crypto_speck64_setkey(&ctx->main_key, key, keylen);
+ if (err)
+ return err;
+
+ return crypto_speck64_setkey(&ctx->tweak_key, key + keylen, keylen);
+}
+
+static struct crypto_alg speck_algs[] = {
+ {
+ .cra_name = "xts(speck128)",
+ .cra_driver_name = "xts-speck128-neon",
+ .cra_priority = 300,
+ .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+ .cra_blocksize = SPECK128_BLOCK_SIZE,
+ .cra_type = &crypto_blkcipher_type,
+ .cra_ctxsize = sizeof(struct speck128_xts_tfm_ctx),
+ .cra_alignmask = 7,
+ .cra_module = THIS_MODULE,
+ .cra_u = {
+ .blkcipher = {
+ .min_keysize = 2 * SPECK128_128_KEY_SIZE,
+ .max_keysize = 2 * SPECK128_256_KEY_SIZE,
+ .ivsize = SPECK128_BLOCK_SIZE,
+ .setkey = speck128_xts_setkey,
+ .encrypt = speck128_xts_encrypt,
+ .decrypt = speck128_xts_decrypt,
+ }
+ }
+ }, {
+ .cra_name = "xts(speck64)",
+ .cra_driver_name = "xts-speck64-neon",
+ .cra_priority = 300,
+ .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+ .cra_blocksize = SPECK64_BLOCK_SIZE,
+ .cra_type = &crypto_blkcipher_type,
+ .cra_ctxsize = sizeof(struct speck64_xts_tfm_ctx),
+ .cra_alignmask = 7,
+ .cra_module = THIS_MODULE,
+ .cra_u = {
+ .blkcipher = {
+ .min_keysize = 2 * SPECK64_96_KEY_SIZE,
+ .max_keysize = 2 * SPECK64_128_KEY_SIZE,
+ .ivsize = SPECK64_BLOCK_SIZE,
+ .setkey = speck64_xts_setkey,
+ .encrypt = speck64_xts_encrypt,
+ .decrypt = speck64_xts_decrypt,
+ }
+ }
+ }
+};
+
+static int __init speck_neon_module_init(void)
+{
+ if (!(elf_hwcap & HWCAP_NEON))
+ return -ENODEV;
+ return crypto_register_algs(speck_algs, ARRAY_SIZE(speck_algs));
+}
+
+static void __exit speck_neon_module_exit(void)
+{
+ crypto_unregister_algs(speck_algs, ARRAY_SIZE(speck_algs));
+}
+
+module_init(speck_neon_module_init);
+module_exit(speck_neon_module_exit);
+
+MODULE_DESCRIPTION("Speck block cipher (NEON-accelerated)");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("xts(speck128)");
+MODULE_ALIAS_CRYPTO("xts-speck128-neon");
+MODULE_ALIAS_CRYPTO("xts(speck64)");
+MODULE_ALIAS_CRYPTO("xts-speck64-neon");
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index d5525bf..9156fc3 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -491,7 +491,6 @@
#endif
#ifdef CONFIG_DEBUG_RODATA
-void mark_rodata_ro(void);
void set_kernel_text_rw(void);
void set_kernel_text_ro(void);
#else
diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index f13ae15..d2315ff 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -112,8 +112,12 @@
#define CORE_DUMP_USE_REGSET
#define ELF_EXEC_PAGESIZE 4096
-/* This is the base location for PIE (ET_DYN with INTERP) loads. */
-#define ELF_ET_DYN_BASE 0x400000UL
+/* This is the location that an ET_DYN program is loaded if exec'ed. Typical
+ use of this is to invoke "./ld.so someprog" to test out a new version of
+ the loader. We need to make sure that it is out of the way of the program
+ that it will "exec", and that there is sufficient room for the brk. */
+
+#define ELF_ET_DYN_BASE (TASK_SIZE / 3 * 2)
/* When the program starts, a1 contains a pointer to a function to be
registered with atexit, as per the SVR4 ABI. A value of 0 means we
diff --git a/arch/arm/include/asm/exception.h b/arch/arm/include/asm/exception.h
index 5abaf5b..bf19912 100644
--- a/arch/arm/include/asm/exception.h
+++ b/arch/arm/include/asm/exception.h
@@ -7,7 +7,7 @@
#ifndef __ASM_ARM_EXCEPTION_H
#define __ASM_ARM_EXCEPTION_H
-#include <linux/ftrace.h>
+#include <linux/interrupt.h>
#define __exception __attribute__((section(".exception.text")))
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/arch/arm/include/asm/fiq_glue.h b/arch/arm/include/asm/fiq_glue.h
new file mode 100644
index 0000000..a9e244f9
--- /dev/null
+++ b/arch/arm/include/asm/fiq_glue.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2010 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __ASM_FIQ_GLUE_H
+#define __ASM_FIQ_GLUE_H
+
+struct fiq_glue_handler {
+ void (*fiq)(struct fiq_glue_handler *h, void *regs, void *svc_sp);
+ void (*resume)(struct fiq_glue_handler *h);
+};
+typedef void (*fiq_return_handler_t)(void);
+
+int fiq_glue_register_handler(struct fiq_glue_handler *handler);
+int fiq_glue_set_return_handler(fiq_return_handler_t fiq_return);
+int fiq_glue_clear_return_handler(fiq_return_handler_t fiq_return);
+
+#ifdef CONFIG_FIQ_GLUE
+void fiq_glue_resume(void);
+#else
+static inline void fiq_glue_resume(void) {}
+#endif
+
+#endif
diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 194c91b..c35c349 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -79,6 +79,8 @@
#define rr_lo_hi(a1, a2) a1, a2
#endif
+#define kvm_ksym_ref(kva) (kva)
+
#ifndef __ASSEMBLY__
struct kvm;
struct kvm_vcpu;
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index fd929b5..9459ca8 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -250,6 +250,7 @@
PMD_BIT_FUNC(mksplitting, |= L_PMD_SECT_SPLITTING);
PMD_BIT_FUNC(mkwrite, &= ~L_PMD_SECT_RDONLY);
PMD_BIT_FUNC(mkdirty, |= L_PMD_SECT_DIRTY);
+PMD_BIT_FUNC(mkclean, &= ~L_PMD_SECT_DIRTY);
PMD_BIT_FUNC(mkyoung, |= PMD_SECT_AF);
#define pmd_mkhuge(pmd) (__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 370f7a7..d060641 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -3,6 +3,7 @@
#ifdef CONFIG_ARM_CPU_TOPOLOGY
+#include <linux/cpufreq.h>
#include <linux/cpumask.h>
struct cputopo_arm {
@@ -24,6 +25,12 @@
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
+#ifdef CONFIG_CPU_FREQ
+#define arch_scale_freq_capacity cpufreq_scale_freq_capacity
+#endif
+#define arch_scale_cpu_capacity scale_cpu_capacity
+extern unsigned long scale_cpu_capacity(struct sched_domain *sd, int cpu);
+
#else
static inline void init_cpu_topology(void) { }
diff --git a/arch/arm/include/asm/traps.h b/arch/arm/include/asm/traps.h
index f555bb3..683d923 100644
--- a/arch/arm/include/asm/traps.h
+++ b/arch/arm/include/asm/traps.h
@@ -18,7 +18,6 @@
void register_undef_hook(struct undef_hook *hook);
void unregister_undef_hook(struct undef_hook *hook);
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static inline int __in_irqentry_text(unsigned long ptr)
{
extern char __irqentry_text_start[];
@@ -27,12 +26,6 @@
return ptr >= (unsigned long)&__irqentry_text_start &&
ptr < (unsigned long)&__irqentry_text_end;
}
-#else
-static inline int __in_irqentry_text(unsigned long ptr)
-{
- return 0;
-}
-#endif
static inline int in_exception_text(unsigned long ptr)
{
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index cd8b589..7665bd2 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -496,7 +496,10 @@
static inline unsigned long __must_check
__copy_from_user(void *to, const void __user *from, unsigned long n)
{
- unsigned int __ua_flags = uaccess_save_and_enable();
+ unsigned int __ua_flags;
+
+ check_object_size(to, n, false);
+ __ua_flags = uaccess_save_and_enable();
n = arm_copy_from_user(to, from, n);
uaccess_restore(__ua_flags);
return n;
@@ -511,11 +514,15 @@
__copy_to_user(void __user *to, const void *from, unsigned long n)
{
#ifndef CONFIG_UACCESS_WITH_MEMCPY
- unsigned int __ua_flags = uaccess_save_and_enable();
+ unsigned int __ua_flags;
+
+ check_object_size(from, n, true);
+ __ua_flags = uaccess_save_and_enable();
n = arm_copy_to_user(to, from, n);
uaccess_restore(__ua_flags);
return n;
#else
+ check_object_size(from, n, true);
return arm_copy_to_user(to, from, n);
#endif
}
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 3c78949..82bdac0 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -87,8 +87,9 @@
obj-$(CONFIG_ARM_VIRT_EXT) += hyp-stub.o
ifeq ($(CONFIG_ARM_PSCI),y)
-obj-y += psci-call.o
obj-$(CONFIG_SMP) += psci_smp.o
endif
+obj-$(CONFIG_HAVE_ARM_SMCCC) += smccc-call.o
+
extra-y := $(head-y) vmlinux.lds
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index f89811f..7e45f69 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -16,6 +16,7 @@
#include <linux/syscalls.h>
#include <linux/uaccess.h>
#include <linux/io.h>
+#include <linux/arm-smccc.h>
#include <asm/checksum.h>
#include <asm/ftrace.h>
@@ -175,3 +176,8 @@
EXPORT_SYMBOL(__pv_phys_pfn_offset);
EXPORT_SYMBOL(__pv_offset);
#endif
+
+#ifdef CONFIG_HAVE_ARM_SMCCC
+EXPORT_SYMBOL(arm_smccc_smc);
+EXPORT_SYMBOL(arm_smccc_hvc);
+#endif
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 871b826..a586cfe 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -10,7 +10,6 @@
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
-#include <linux/compiler.h>
#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>
@@ -41,19 +40,10 @@
* GCC 3.2.x: miscompiles NEW_AUX_ENT in fs/binfmt_elf.c
* (http://gcc.gnu.org/PR8896) and incorrect structure
* initialisation in fs/jffs2/erase.c
- * GCC 4.8.0-4.8.2: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854
- * miscompiles find_get_entry(), and can result in EXT3 and EXT4
- * filesystem corruption (possibly other FS too).
*/
-#ifdef __GNUC__
#if (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
#error Your compiler is too buggy; it is known to miscompile kernels.
-#error Known good compilers: 3.3, 4.x
-#endif
-#if GCC_VERSION >= 40800 && GCC_VERSION < 40803
-#error Your compiler is too buggy; it is known to miscompile kernels
-#error and result in filesystem corruption and oopses.
-#endif
+#error Known good compilers: 3.3
#endif
int main(void)
diff --git a/arch/arm/kernel/kgdb.c b/arch/arm/kernel/kgdb.c
index 9232cae..f3c6622 100644
--- a/arch/arm/kernel/kgdb.c
+++ b/arch/arm/kernel/kgdb.c
@@ -140,6 +140,8 @@
static int kgdb_brk_fn(struct pt_regs *regs, unsigned int instr)
{
+ if (user_mode(regs))
+ return -1;
kgdb_handle_exception(1, SIGTRAP, 0, regs);
return 0;
@@ -147,6 +149,8 @@
static int kgdb_compiled_brk_fn(struct pt_regs *regs, unsigned int instr)
{
+ if (user_mode(regs))
+ return -1;
compiled_break = 1;
kgdb_handle_exception(1, SIGTRAP, 0, regs);
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 4adfb46..e06307a 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -80,6 +80,7 @@
void arch_cpu_idle_enter(void)
{
+ idle_notifier_call_chain(IDLE_START);
ledtrig_cpu(CPU_LED_IDLE_START);
#ifdef CONFIG_PL310_ERRATA_769419
wmb();
@@ -89,6 +90,78 @@
void arch_cpu_idle_exit(void)
{
ledtrig_cpu(CPU_LED_IDLE_END);
+ idle_notifier_call_chain(IDLE_END);
+}
+
+/*
+ * dump a block of kernel memory from around the given address
+ */
+static void show_data(unsigned long addr, int nbytes, const char *name)
+{
+ int i, j;
+ int nlines;
+ u32 *p;
+
+ /*
+ * don't attempt to dump non-kernel addresses or
+ * values that are probably just small negative numbers
+ */
+ if (addr < PAGE_OFFSET || addr > -256UL)
+ return;
+
+ printk("\n%s: %#lx:\n", name, addr);
+
+ /*
+ * round address down to a 32 bit boundary
+ * and always dump a multiple of 32 bytes
+ */
+ p = (u32 *)(addr & ~(sizeof(u32) - 1));
+ nbytes += (addr & (sizeof(u32) - 1));
+ nlines = (nbytes + 31) / 32;
+
+
+ for (i = 0; i < nlines; i++) {
+ /*
+ * just display low 16 bits of address to keep
+ * each line of the dump < 80 characters
+ */
+ printk("%04lx ", (unsigned long)p & 0xffff);
+ for (j = 0; j < 8; j++) {
+ u32 data;
+ if (probe_kernel_address(p, data)) {
+ printk(" ********");
+ } else {
+ printk(" %08x", data);
+ }
+ ++p;
+ }
+ printk("\n");
+ }
+}
+
+static void show_extra_register_data(struct pt_regs *regs, int nbytes)
+{
+ mm_segment_t fs;
+
+ fs = get_fs();
+ set_fs(KERNEL_DS);
+ show_data(regs->ARM_pc - nbytes, nbytes * 2, "PC");
+ show_data(regs->ARM_lr - nbytes, nbytes * 2, "LR");
+ show_data(regs->ARM_sp - nbytes, nbytes * 2, "SP");
+ show_data(regs->ARM_ip - nbytes, nbytes * 2, "IP");
+ show_data(regs->ARM_fp - nbytes, nbytes * 2, "FP");
+ show_data(regs->ARM_r0 - nbytes, nbytes * 2, "R0");
+ show_data(regs->ARM_r1 - nbytes, nbytes * 2, "R1");
+ show_data(regs->ARM_r2 - nbytes, nbytes * 2, "R2");
+ show_data(regs->ARM_r3 - nbytes, nbytes * 2, "R3");
+ show_data(regs->ARM_r4 - nbytes, nbytes * 2, "R4");
+ show_data(regs->ARM_r5 - nbytes, nbytes * 2, "R5");
+ show_data(regs->ARM_r6 - nbytes, nbytes * 2, "R6");
+ show_data(regs->ARM_r7 - nbytes, nbytes * 2, "R7");
+ show_data(regs->ARM_r8 - nbytes, nbytes * 2, "R8");
+ show_data(regs->ARM_r9 - nbytes, nbytes * 2, "R9");
+ show_data(regs->ARM_r10 - nbytes, nbytes * 2, "R10");
+ set_fs(fs);
}
void __show_regs(struct pt_regs *regs)
@@ -178,6 +251,8 @@
printk("Control: %08x%s\n", ctrl, buf);
}
#endif
+
+ show_extra_register_data(regs, 128);
}
void show_regs(struct pt_regs * regs)
@@ -193,9 +268,9 @@
/*
* Free current thread data structures etc..
*/
-void exit_thread(void)
+void exit_thread(struct task_struct *tsk)
{
- thread_notify(THREAD_NOTIFY_EXIT, current_thread_info());
+ thread_notify(THREAD_NOTIFY_EXIT, task_thread_info(tsk));
}
void flush_thread(void)
diff --git a/arch/arm/kernel/psci-call.S b/arch/arm/kernel/psci-call.S
deleted file mode 100644
index a78e9e1..0000000
--- a/arch/arm/kernel/psci-call.S
+++ /dev/null
@@ -1,31 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * Copyright (C) 2015 ARM Limited
- *
- * Author: Mark Rutland <mark.rutland@arm.com>
- */
-
-#include <linux/linkage.h>
-
-#include <asm/opcodes-sec.h>
-#include <asm/opcodes-virt.h>
-
-/* int __invoke_psci_fn_hvc(u32 function_id, u32 arg0, u32 arg1, u32 arg2) */
-ENTRY(__invoke_psci_fn_hvc)
- __HVC(0)
- bx lr
-ENDPROC(__invoke_psci_fn_hvc)
-
-/* int __invoke_psci_fn_smc(u32 function_id, u32 arg0, u32 arg1, u32 arg2) */
-ENTRY(__invoke_psci_fn_smc)
- __SMC(0)
- bx lr
-ENDPROC(__invoke_psci_fn_smc)
diff --git a/arch/arm/kernel/reboot.c b/arch/arm/kernel/reboot.c
index 3826935..1a06da8 100644
--- a/arch/arm/kernel/reboot.c
+++ b/arch/arm/kernel/reboot.c
@@ -6,6 +6,7 @@
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
+#include <linux/console.h>
#include <linux/cpu.h>
#include <linux/delay.h>
#include <linux/reboot.h>
@@ -124,6 +125,31 @@
pm_power_off();
}
+#ifdef CONFIG_ARM_FLUSH_CONSOLE_ON_RESTART
+void arm_machine_flush_console(void)
+{
+ printk("\n");
+ pr_emerg("Restarting %s\n", linux_banner);
+ if (console_trylock()) {
+ console_unlock();
+ return;
+ }
+
+ mdelay(50);
+
+ local_irq_disable();
+ if (!console_trylock())
+ pr_emerg("arm_restart: Console was locked! Busting\n");
+ else
+ pr_emerg("arm_restart: Console was locked!\n");
+ console_unlock();
+}
+#else
+void arm_machine_flush_console(void)
+{
+}
+#endif
+
/*
* Restart requires that the secondary CPUs stop performing any activity
* while the primary CPU resets the system. Systems with a single CPU can
@@ -140,6 +166,10 @@
local_irq_disable();
smp_send_stop();
+ /* Flush the console to make sure all the relevant messages make it
+ * out to the console drivers */
+ arm_machine_flush_console();
+
if (arm_pm_restart)
arm_pm_restart(reboot_mode, cmd);
else
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 20edd34..bf63b46 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -772,7 +772,7 @@
struct resource *res;
kernel_code.start = virt_to_phys(_text);
- kernel_code.end = virt_to_phys(_etext - 1);
+ kernel_code.end = virt_to_phys(__init_begin - 1);
kernel_data.start = virt_to_phys(_sdata);
kernel_data.end = virt_to_phys(_end - 1);
diff --git a/arch/arm/kernel/smccc-call.S b/arch/arm/kernel/smccc-call.S
new file mode 100644
index 0000000..2e48b67
--- /dev/null
+++ b/arch/arm/kernel/smccc-call.S
@@ -0,0 +1,62 @@
+/*
+ * Copyright (c) 2015, Linaro Limited
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+#include <linux/linkage.h>
+
+#include <asm/opcodes-sec.h>
+#include <asm/opcodes-virt.h>
+#include <asm/unwind.h>
+
+ /*
+ * Wrap c macros in asm macros to delay expansion until after the
+ * SMCCC asm macro is expanded.
+ */
+ .macro SMCCC_SMC
+ __SMC(0)
+ .endm
+
+ .macro SMCCC_HVC
+ __HVC(0)
+ .endm
+
+ .macro SMCCC instr
+UNWIND( .fnstart)
+ mov r12, sp
+ push {r4-r7}
+UNWIND( .save {r4-r7})
+ ldm r12, {r4-r7}
+ \instr
+ pop {r4-r7}
+ ldr r12, [sp, #(4 * 4)]
+ stm r12, {r0-r3}
+ bx lr
+UNWIND( .fnend)
+ .endm
+
+/*
+ * void smccc_smc(unsigned long a0, unsigned long a1, unsigned long a2,
+ * unsigned long a3, unsigned long a4, unsigned long a5,
+ * unsigned long a6, unsigned long a7, struct arm_smccc_res *res)
+ */
+ENTRY(arm_smccc_smc)
+ SMCCC SMCCC_SMC
+ENDPROC(arm_smccc_smc)
+
+/*
+ * void smccc_hvc(unsigned long a0, unsigned long a1, unsigned long a2,
+ * unsigned long a3, unsigned long a4, unsigned long a5,
+ * unsigned long a6, unsigned long a7, struct arm_smccc_res *res)
+ */
+ENTRY(arm_smccc_hvc)
+ SMCCC SMCCC_HVC
+ENDPROC(arm_smccc_hvc)
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 08b7847..4f2c51e 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -42,9 +42,15 @@
*/
static DEFINE_PER_CPU(unsigned long, cpu_scale);
-unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
+unsigned long scale_cpu_capacity(struct sched_domain *sd, int cpu)
{
+#ifdef CONFIG_CPU_FREQ
+ unsigned long max_freq_scale = cpufreq_scale_max_freq_capacity(cpu);
+
+ return per_cpu(cpu_scale, cpu) * max_freq_scale >> SCHED_CAPACITY_SHIFT;
+#else
return per_cpu(cpu_scale, cpu);
+#endif
}
static void set_capacity_scale(unsigned int cpu, unsigned long capacity)
@@ -153,6 +159,8 @@
}
+static const struct sched_group_energy * const cpu_core_energy(int cpu);
+
/*
* Look for a customed capacity of a CPU in the cpu_capacity table during the
* boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
@@ -160,10 +168,14 @@
*/
static void update_cpu_capacity(unsigned int cpu)
{
- if (!cpu_capacity(cpu))
- return;
+ unsigned long capacity = SCHED_CAPACITY_SCALE;
- set_capacity_scale(cpu, cpu_capacity(cpu) / middle_capacity);
+ if (cpu_core_energy(cpu)) {
+ int max_cap_idx = cpu_core_energy(cpu)->nr_cap_states - 1;
+ capacity = cpu_core_energy(cpu)->cap_states[max_cap_idx].cap;
+ }
+
+ set_capacity_scale(cpu, capacity);
pr_info("CPU%u: update cpu_capacity %lu\n",
cpu, arch_scale_cpu_capacity(NULL, cpu));
@@ -275,17 +287,138 @@
cpu_topology[cpuid].socket_id, mpidr);
}
+/*
+ * ARM TC2 specific energy cost model data. There are no unit requirements for
+ * the data. Data can be normalized to any reference point, but the
+ * normalization must be consistent. That is, one bogo-joule/watt must be the
+ * same quantity for all data, but we don't care what it is.
+ */
+static struct idle_state idle_states_cluster_a7[] = {
+ { .power = 25 }, /* arch_cpu_idle() (active idle) = WFI */
+ { .power = 25 }, /* WFI */
+ { .power = 10 }, /* cluster-sleep-l */
+ };
+
+static struct idle_state idle_states_cluster_a15[] = {
+ { .power = 70 }, /* arch_cpu_idle() (active idle) = WFI */
+ { .power = 70 }, /* WFI */
+ { .power = 25 }, /* cluster-sleep-b */
+ };
+
+static struct capacity_state cap_states_cluster_a7[] = {
+ /* Cluster only power */
+ { .cap = 150, .power = 2967, }, /* 350 MHz */
+ { .cap = 172, .power = 2792, }, /* 400 MHz */
+ { .cap = 215, .power = 2810, }, /* 500 MHz */
+ { .cap = 258, .power = 2815, }, /* 600 MHz */
+ { .cap = 301, .power = 2919, }, /* 700 MHz */
+ { .cap = 344, .power = 2847, }, /* 800 MHz */
+ { .cap = 387, .power = 3917, }, /* 900 MHz */
+ { .cap = 430, .power = 4905, }, /* 1000 MHz */
+ };
+
+static struct capacity_state cap_states_cluster_a15[] = {
+ /* Cluster only power */
+ { .cap = 426, .power = 7920, }, /* 500 MHz */
+ { .cap = 512, .power = 8165, }, /* 600 MHz */
+ { .cap = 597, .power = 8172, }, /* 700 MHz */
+ { .cap = 682, .power = 8195, }, /* 800 MHz */
+ { .cap = 768, .power = 8265, }, /* 900 MHz */
+ { .cap = 853, .power = 8446, }, /* 1000 MHz */
+ { .cap = 938, .power = 11426, }, /* 1100 MHz */
+ { .cap = 1024, .power = 15200, }, /* 1200 MHz */
+ };
+
+static struct sched_group_energy energy_cluster_a7 = {
+ .nr_idle_states = ARRAY_SIZE(idle_states_cluster_a7),
+ .idle_states = idle_states_cluster_a7,
+ .nr_cap_states = ARRAY_SIZE(cap_states_cluster_a7),
+ .cap_states = cap_states_cluster_a7,
+};
+
+static struct sched_group_energy energy_cluster_a15 = {
+ .nr_idle_states = ARRAY_SIZE(idle_states_cluster_a15),
+ .idle_states = idle_states_cluster_a15,
+ .nr_cap_states = ARRAY_SIZE(cap_states_cluster_a15),
+ .cap_states = cap_states_cluster_a15,
+};
+
+static struct idle_state idle_states_core_a7[] = {
+ { .power = 0 }, /* arch_cpu_idle (active idle) = WFI */
+ { .power = 0 }, /* WFI */
+ { .power = 0 }, /* cluster-sleep-l */
+ };
+
+static struct idle_state idle_states_core_a15[] = {
+ { .power = 0 }, /* arch_cpu_idle (active idle) = WFI */
+ { .power = 0 }, /* WFI */
+ { .power = 0 }, /* cluster-sleep-b */
+ };
+
+static struct capacity_state cap_states_core_a7[] = {
+ /* Power per cpu */
+ { .cap = 150, .power = 187, }, /* 350 MHz */
+ { .cap = 172, .power = 275, }, /* 400 MHz */
+ { .cap = 215, .power = 334, }, /* 500 MHz */
+ { .cap = 258, .power = 407, }, /* 600 MHz */
+ { .cap = 301, .power = 447, }, /* 700 MHz */
+ { .cap = 344, .power = 549, }, /* 800 MHz */
+ { .cap = 387, .power = 761, }, /* 900 MHz */
+ { .cap = 430, .power = 1024, }, /* 1000 MHz */
+ };
+
+static struct capacity_state cap_states_core_a15[] = {
+ /* Power per cpu */
+ { .cap = 426, .power = 2021, }, /* 500 MHz */
+ { .cap = 512, .power = 2312, }, /* 600 MHz */
+ { .cap = 597, .power = 2756, }, /* 700 MHz */
+ { .cap = 682, .power = 3125, }, /* 800 MHz */
+ { .cap = 768, .power = 3524, }, /* 900 MHz */
+ { .cap = 853, .power = 3846, }, /* 1000 MHz */
+ { .cap = 938, .power = 5177, }, /* 1100 MHz */
+ { .cap = 1024, .power = 6997, }, /* 1200 MHz */
+ };
+
+static struct sched_group_energy energy_core_a7 = {
+ .nr_idle_states = ARRAY_SIZE(idle_states_core_a7),
+ .idle_states = idle_states_core_a7,
+ .nr_cap_states = ARRAY_SIZE(cap_states_core_a7),
+ .cap_states = cap_states_core_a7,
+};
+
+static struct sched_group_energy energy_core_a15 = {
+ .nr_idle_states = ARRAY_SIZE(idle_states_core_a15),
+ .idle_states = idle_states_core_a15,
+ .nr_cap_states = ARRAY_SIZE(cap_states_core_a15),
+ .cap_states = cap_states_core_a15,
+};
+
+/* sd energy functions */
+static inline
+const struct sched_group_energy * const cpu_cluster_energy(int cpu)
+{
+ return cpu_topology[cpu].socket_id ? &energy_cluster_a7 :
+ &energy_cluster_a15;
+}
+
+static inline
+const struct sched_group_energy * const cpu_core_energy(int cpu)
+{
+ return cpu_topology[cpu].socket_id ? &energy_core_a7 :
+ &energy_core_a15;
+}
+
static inline int cpu_corepower_flags(void)
{
- return SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN;
+ return SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN | \
+ SD_SHARE_CAP_STATES;
}
static struct sched_domain_topology_level arm_topology[] = {
#ifdef CONFIG_SCHED_MC
- { cpu_corepower_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
- { cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
+ { cpu_coregroup_mask, cpu_corepower_flags, cpu_core_energy, SD_INIT_NAME(MC) },
#endif
- { cpu_cpu_mask, SD_INIT_NAME(DIE) },
+ { cpu_cpu_mask, NULL, cpu_cluster_energy, SD_INIT_NAME(DIE) },
{ NULL, },
};
diff --git a/arch/arm/kernel/vdso.c b/arch/arm/kernel/vdso.c
index 2dee872..6f7772f 100644
--- a/arch/arm/kernel/vdso.c
+++ b/arch/arm/kernel/vdso.c
@@ -17,6 +17,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
+#include <linux/cache.h>
#include <linux/elf.h>
#include <linux/err.h>
#include <linux/kernel.h>
@@ -41,7 +42,7 @@
extern char vdso_start[], vdso_end[];
/* Total number of pages needed for the data and text portions of the VDSO. */
-unsigned int vdso_total_pages __read_mostly;
+unsigned int vdso_total_pages __ro_after_init;
/*
* The VDSO data page.
@@ -49,13 +50,13 @@
static union vdso_data_store vdso_data_store __page_aligned_data;
static struct vdso_data *vdso_data = &vdso_data_store.data;
-static struct page *vdso_data_page;
-static struct vm_special_mapping vdso_data_mapping = {
+static struct page *vdso_data_page __ro_after_init;
+static const struct vm_special_mapping vdso_data_mapping = {
.name = "[vvar]",
.pages = &vdso_data_page,
};
-static struct vm_special_mapping vdso_text_mapping = {
+static struct vm_special_mapping vdso_text_mapping __ro_after_init = {
.name = "[vdso]",
};
@@ -69,7 +70,7 @@
/* Cached result of boot-time check for whether the arch timer exists,
* and if so, whether the virtual counter is useable.
*/
-static bool cntvct_ok __read_mostly;
+static bool cntvct_ok __ro_after_init;
static bool __init cntvct_functional(void)
{
@@ -226,7 +227,7 @@
VM_READ | VM_MAYREAD,
&vdso_data_mapping);
- return IS_ERR(vma) ? PTR_ERR(vma) : 0;
+ return PTR_ERR_OR_ZERO(vma);
}
/* assumes mmap_sem is write-locked */
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 8b60fde..b2e2344 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -105,6 +105,7 @@
*(.exception.text)
__exception_text_end = .;
IRQENTRY_TEXT
+ SOFTIRQENTRY_TEXT
TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
@@ -120,6 +121,8 @@
#ifdef CONFIG_DEBUG_RODATA
. = ALIGN(1<<SECTION_SHIFT);
#endif
+ _etext = .; /* End of text section */
+
RO_DATA(PAGE_SIZE)
. = ALIGN(4);
@@ -150,8 +153,6 @@
NOTES
- _etext = .; /* End of text and rodata section */
-
#ifndef CONFIG_XIP_KERNEL
# ifdef CONFIG_ARM_KERNMEM_PERMS
. = ALIGN(1<<SECTION_SHIFT);
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index d7bef21..c17cb14 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -967,7 +967,7 @@
pgd_ptr = kvm_mmu_get_httbr();
stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
hyp_stack_ptr = stack_page + PAGE_SIZE;
- vector_ptr = (unsigned long)__kvm_hyp_vector;
+ vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
@@ -1059,7 +1059,8 @@
/*
* Map the Hyp-code called directly from the host
*/
- err = create_hyp_mappings(__kvm_hyp_code_start, __kvm_hyp_code_end);
+ err = create_hyp_mappings(kvm_ksym_ref(__kvm_hyp_code_start),
+ kvm_ksym_ref(__kvm_hyp_code_end));
if (err) {
kvm_err("Cannot map world-switch code\n");
goto out_free_mappings;
diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S
index 2465995..11da0f5 100644
--- a/arch/arm/mm/cache-v6.S
+++ b/arch/arm/mm/cache-v6.S
@@ -270,6 +270,11 @@
* - end - virtual end address of region
*/
ENTRY(v6_dma_flush_range)
+#ifdef CONFIG_CACHE_FLUSH_RANGE_LIMIT
+ sub r2, r1, r0
+ cmp r2, #CONFIG_CACHE_FLUSH_RANGE_LIMIT
+ bhi v6_dma_flush_dcache_all
+#endif
#ifdef CONFIG_DMA_CACHE_RWFO
ldrb r2, [r0] @ read for ownership
strb r2, [r0] @ write for ownership
@@ -292,6 +297,18 @@
mcr p15, 0, r0, c7, c10, 4 @ drain write buffer
ret lr
+#ifdef CONFIG_CACHE_FLUSH_RANGE_LIMIT
+v6_dma_flush_dcache_all:
+ mov r0, #0
+#ifdef HARVARD_CACHE
+ mcr p15, 0, r0, c7, c14, 0 @ D cache clean+invalidate
+#else
+ mcr p15, 0, r0, c7, c15, 0 @ Cache clean+invalidate
+#endif
+ mcr p15, 0, r0, c7, c10, 4 @ drain write buffer
+ mov pc, lr
+#endif
+
/*
* dma_map_area(start, size, dir)
* - start - kernel virtual start address
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 0d20cd5..83519af 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -273,10 +273,10 @@
local_irq_enable();
/*
- * If we're in an interrupt or have no user
+ * If we're in an interrupt, or have no irqs, or have no user
* context, we must not take the fault..
*/
- if (faulthandler_disabled() || !mm)
+ if (faulthandler_disabled() || irqs_disabled() || !mm)
goto no_context;
if (user_mode(regs))
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index c469c06..641334e 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -173,8 +173,7 @@
{
unsigned long rnd;
- /* 8 bits of randomness in 20 address space bits */
- rnd = (unsigned long)get_random_int() % (1 << 8);
+ rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
return rnd << PAGE_SHIFT;
}
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index e47cffd..aead23f 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -572,7 +572,7 @@
* in the Short-descriptor translation table format descriptors.
*/
if (cpu_arch == CPU_ARCH_ARMv7 &&
- (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xF) == 4) {
+ (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xF) >= 4) {
user_pmd_table |= PMD_PXNTABLE;
}
#endif
diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
index 1160434..59a8fa7 100644
--- a/arch/arm/vdso/Makefile
+++ b/arch/arm/vdso/Makefile
@@ -74,5 +74,5 @@
@mkdir -p $(MODLIB)/vdso
PHONY += vdso_install
-vdso_install: $(obj)/vdso.so.dbg $(MODLIB)/vdso FORCE
+vdso_install: $(obj)/vdso.so.dbg $(MODLIB)/vdso
$(call cmd,vdso_install)
diff --git a/arch/arm/vdso/vdso.S b/arch/arm/vdso/vdso.S
index b2b97e3..a62a7b6 100644
--- a/arch/arm/vdso/vdso.S
+++ b/arch/arm/vdso/vdso.S
@@ -23,9 +23,8 @@
#include <linux/const.h>
#include <asm/page.h>
- __PAGE_ALIGNED_DATA
-
.globl vdso_start, vdso_end
+ .section .data..ro_after_init
.balign PAGE_SIZE
vdso_start:
.incbin "arch/arm/vdso/vdso.so"
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 2a61e4b..73085d34 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -156,10 +156,6 @@
* - we could be preempted if tree preempt rcu is enabled, so
* it is unsafe to use thread->cpu.
* THREAD_NOTIFY_EXIT
- * - the thread (v) will be running on the local CPU, so
- * v === current_thread_info()
- * - thread->cpu is the local CPU number at the time it is accessed,
- * but may change at any time.
* - we could be preempted if tree preempt rcu is enabled, so
* it is unsafe to use thread->cpu.
*/
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5b47218..95e8c51 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -13,6 +13,7 @@
select ARCH_WANT_OPTIONAL_GPIOLIB
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
select ARCH_WANT_FRAME_POINTERS
+ select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARM_AMBA
select ARM_ARCH_TIMER
select ARM_GIC
@@ -48,9 +49,13 @@
select HAVE_ALIGNED_STRUCT_PAGE if SLUB
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_BITREVERSE
+ select HAVE_ARCH_HARDENED_USERCOPY
+ select HAVE_ARCH_HUGE_VMAP
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_KASAN if SPARSEMEM_VMEMMAP && !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
select HAVE_ARCH_KGDB
+ select HAVE_ARCH_MMAP_RND_BITS
+ select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_BPF_JIT
@@ -71,6 +76,7 @@
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_GENERIC_DMA_COHERENT
select HAVE_HW_BREAKPOINT if PERF_EVENTS
+ select HAVE_IRQ_TIME_ACCOUNTING
select HAVE_MEMBLOCK
select HAVE_PATA_PLATFORM
select HAVE_PERF_EVENTS
@@ -93,6 +99,8 @@
select SPARSE_IRQ
select SYSCTL_EXCEPTION_TRACE
select HAVE_CONTEXT_TRACKING
+ select THREAD_INFO_IN_TASK
+ select HAVE_ARM_SMCCC
help
ARM 64-bit (AArch64) Linux support.
@@ -105,9 +113,40 @@
config MMU
def_bool y
+config ARCH_MMAP_RND_BITS_MIN
+ default 14 if ARM64_64K_PAGES
+ default 16 if ARM64_16K_PAGES
+ default 18
+
+# max bits determined by the following formula:
+# VA_BITS - PAGE_SHIFT - 3
+config ARCH_MMAP_RND_BITS_MAX
+ default 19 if ARM64_VA_BITS=36
+ default 24 if ARM64_VA_BITS=39
+ default 27 if ARM64_VA_BITS=42
+ default 30 if ARM64_VA_BITS=47
+ default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
+ default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
+ default 33 if ARM64_VA_BITS=48
+ default 14 if ARM64_64K_PAGES
+ default 16 if ARM64_16K_PAGES
+ default 18
+
+config ARCH_MMAP_RND_COMPAT_BITS_MIN
+ default 7 if ARM64_64K_PAGES
+ default 9 if ARM64_16K_PAGES
+ default 11
+
+config ARCH_MMAP_RND_COMPAT_BITS_MAX
+ default 16
+
config NO_IOPORT_MAP
def_bool y if !PCI
+config ILLEGAL_POINTER_VALUE
+ hex
+ default 0xdead000000000000
+
config STACKTRACE_SUPPORT
def_bool y
@@ -363,6 +402,7 @@
bool "Cortex-A53: 843419: A load or store might access an incorrect address"
depends on MODULES
default y
+ select ARM64_MODULE_CMODEL_LARGE
help
This option builds kernel modules using the large memory model in
order to avoid the use of the ADRP instruction, which can cause
@@ -541,6 +581,9 @@
source kernel/Kconfig.preempt
source kernel/Kconfig.hz
+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+ def_bool y
+
config ARCH_HAS_HOLES_MEMORYMODEL
def_bool y if SPARSEMEM
@@ -564,9 +607,6 @@
config SYS_SUPPORTS_HUGETLBFS
def_bool y
-config ARCH_WANT_GENERAL_HUGETLB
- def_bool y
-
config ARCH_WANT_HUGE_PMD_SHARE
def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
@@ -625,6 +665,18 @@
However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
4M allocations matching the default size used by generic code.
+config UNMAP_KERNEL_AT_EL0
+ bool "Unmap kernel when running in userspace (aka \"KAISER\")" if EXPERT
+ default y
+ help
+ Speculation attacks against some high-performance processors can
+ be used to bypass MMU permission checks and leak kernel data to
+ userspace. This can be defended against by unmapping the kernel
+ when running in userspace, mapping it back in on exception entry
+ via a trampoline page in the vector table.
+
+ If unsure, say Y.
+
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT
@@ -692,6 +744,14 @@
If unsure, say Y
endif
+config ARM64_SW_TTBR0_PAN
+ bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
+ help
+ Enabling this option prevents the kernel from accessing
+ user-space memory directly by pointing TTBR0_EL1 to a reserved
+ zeroed area and reserved ASID. The user access routines
+ restore the valid TTBR0_EL1 temporarily.
+
menu "ARMv8.1 architectural features"
config ARM64_HW_AFDBM
@@ -739,10 +799,93 @@
endmenu
+config ARM64_UAO
+ bool "Enable support for User Access Override (UAO)"
+ default y
+ help
+ User Access Override (UAO; part of the ARMv8.2 Extensions)
+ causes the 'unprivileged' variant of the load/store instructions to
+ be overriden to be privileged.
+
+ This option changes get_user() and friends to use the 'unprivileged'
+ variant of the load/store instructions. This ensures that user-space
+ really did have access to the supplied memory. When addr_limit is
+ set to kernel memory the UAO bit will be set, allowing privileged
+ access to kernel memory.
+
+ Choosing this option will cause copy_to_user() et al to use user-space
+ memory permissions.
+
+ The feature is detected at runtime, the kernel will use the
+ regular load/store instructions if the cpu does not implement the
+ feature.
+
+config ARM64_MODULE_CMODEL_LARGE
+ bool
+
+config ARM64_MODULE_PLTS
+ bool
+ select ARM64_MODULE_CMODEL_LARGE
+ select HAVE_MOD_ARCH_SPECIFIC
+
+config RELOCATABLE
+ bool
+ help
+ This builds the kernel as a Position Independent Executable (PIE),
+ which retains all relocation metadata required to relocate the
+ kernel binary at runtime to a different virtual address than the
+ address it was linked at.
+ Since AArch64 uses the RELA relocation format, this requires a
+ relocation pass at runtime even if the kernel is loaded at the
+ same address it was linked at.
+
+config RANDOMIZE_BASE
+ bool "Randomize the address of the kernel image"
+ select ARM64_MODULE_PLTS if MODULES
+ select RELOCATABLE
+ help
+ Randomizes the virtual address at which the kernel image is
+ loaded, as a security feature that deters exploit attempts
+ relying on knowledge of the location of kernel internals.
+
+ It is the bootloader's job to provide entropy, by passing a
+ random u64 value in /chosen/kaslr-seed at kernel entry.
+
+ When booting via the UEFI stub, it will invoke the firmware's
+ EFI_RNG_PROTOCOL implementation (if available) to supply entropy
+ to the kernel proper. In addition, it will randomise the physical
+ location of the kernel Image as well.
+
+ If unsure, say N.
+
+config RANDOMIZE_MODULE_REGION_FULL
+ bool "Randomize the module region independently from the core kernel"
+ depends on RANDOMIZE_BASE && !DYNAMIC_FTRACE
+ default y
+ help
+ Randomizes the location of the module region without considering the
+ location of the core kernel. This way, it is impossible for modules
+ to leak information about the location of core kernel data structures
+ but it does imply that function calls between modules and the core
+ kernel will need to be resolved via veneers in the module PLT.
+
+ When this option is not set, the module region will be randomized over
+ a limited range that contains the [_stext, _etext] interval of the
+ core kernel, so branch relocations are always in range.
+
endmenu
menu "Boot options"
+config ARM64_ACPI_PARKING_PROTOCOL
+ bool "Enable support for the ARM64 ACPI parking protocol"
+ depends on ACPI
+ help
+ Enable support for the ARM64 ACPI parking protocol. If disabled
+ the kernel will not allow booting through the ARM64 ACPI parking
+ protocol even if the corresponding data is present in the ACPI
+ MADT table.
+
config CMDLINE
string "Default kernel command string"
default ""
@@ -751,6 +894,23 @@
entering them here. As a minimum, you should specify the the
root device (e.g. root=/dev/nfs).
+choice
+ prompt "Kernel command line type" if CMDLINE != ""
+ default CMDLINE_FROM_BOOTLOADER
+
+config CMDLINE_FROM_BOOTLOADER
+ bool "Use bootloader kernel arguments if available"
+ help
+ Uses the command-line options passed by the boot loader. If
+ the boot loader doesn't provide any, the default kernel command
+ string provided in CMDLINE will be used.
+
+config CMDLINE_EXTEND
+ bool "Extend bootloader kernel arguments"
+ help
+ The command-line arguments provided by the boot loader will be
+ appended to the default kernel command string.
+
config CMDLINE_FORCE
bool "Always use the default kernel command string"
help
@@ -758,6 +918,7 @@
loader passes other arguments to the kernel.
This is useful if you cannot or don't want to change the
command-line options your boot loader passes to the kernel.
+endchoice
config EFI_STUB
bool
@@ -790,6 +951,41 @@
However, even with this option, the resultant kernel should
continue to boot on existing non-UEFI platforms.
+config BUILD_ARM64_APPENDED_DTB_IMAGE
+ bool "Build a concatenated Image.gz/dtb by default"
+ depends on OF
+ help
+ Enabling this option will cause a concatenated Image.gz and list of
+ DTBs to be built by default (instead of a standalone Image.gz.)
+ The image will built in arch/arm64/boot/Image.gz-dtb
+
+choice
+ prompt "Appended DTB Kernel Image name"
+ depends on BUILD_ARM64_APPENDED_DTB_IMAGE
+ help
+ Enabling this option will cause a specific kernel image Image or
+ Image.gz to be used for final image creation.
+ The image will built in arch/arm64/boot/IMAGE-NAME-dtb
+
+ config IMG_GZ_DTB
+ bool "Image.gz-dtb"
+ config IMG_DTB
+ bool "Image-dtb"
+endchoice
+
+config BUILD_ARM64_APPENDED_KERNEL_IMAGE_NAME
+ string
+ depends on BUILD_ARM64_APPENDED_DTB_IMAGE
+ default "Image.gz-dtb" if IMG_GZ_DTB
+ default "Image-dtb" if IMG_DTB
+
+config BUILD_ARM64_APPENDED_DTB_IMAGE_NAMES
+ string "Default dtb names"
+ depends on BUILD_ARM64_APPENDED_DTB_IMAGE
+ help
+ Space separated list of names of dtbs to append when
+ building a concatenated Image.gz-dtb.
+
endmenu
menu "Userspace binary formats"
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 04fb73b..8b0cd45 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -64,13 +64,13 @@
config DEBUG_RODATA
bool "Make kernel text and rodata read-only"
+ default y
help
If this is set, kernel text and rodata will be made read-only. This
is to help catch accidental or malicious attempts to change the
- kernel's executable code. Additionally splits rodata from kernel
- text so it can be made explicitly non-executable.
+ kernel's executable code.
- If in doubt, say Y
+ If in doubt, say Y
config DEBUG_ALIGN_RODATA
depends on DEBUG_RODATA && ARM64_4K_PAGES
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index b6c90e5..53c0544 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -15,6 +15,10 @@
OBJCOPYFLAGS :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
GZFLAGS :=-9
+ifneq ($(CONFIG_RELOCATABLE),)
+LDFLAGS_vmlinux += -pie -Bsymbolic
+endif
+
KBUILD_DEFCONFIG := defconfig
# Check for binutils support for specific extensions
@@ -26,7 +30,16 @@
endif
endif
-KBUILD_CFLAGS += -mgeneral-regs-only $(lseinstr)
+ifeq ($(cc-name),clang)
+# This is a workaround for https://bugs.llvm.org/show_bug.cgi?id=30792.
+# TODO: revert when this is fixed in LLVM.
+KBUILD_CFLAGS += -mno-implicit-float
+else
+KBUILD_CFLAGS += -mgeneral-regs-only
+endif
+KBUILD_CFLAGS += $(lseinstr)
+KBUILD_CFLAGS += -fno-pic
+KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
KBUILD_CFLAGS += $(call cc-option, -mpc-relative-literal-loads)
KBUILD_AFLAGS += $(lseinstr)
@@ -42,10 +55,14 @@
CHECKFLAGS += -D__aarch64__
-ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
+ifeq ($(CONFIG_ARM64_MODULE_CMODEL_LARGE), y)
KBUILD_CFLAGS_MODULE += -mcmodel=large
endif
+ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
+KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/arm64/kernel/module.lds
+endif
+
# Default value
head-y := arch/arm64/kernel/head.o
@@ -56,6 +73,10 @@
TEXT_OFFSET := 0x00080000
endif
+ifeq ($(cc-name),clang)
+KBUILD_CFLAGS += $(call cc-disable-warning, asm-operand-widths)
+endif
+
# KASAN_SHADOW_OFFSET = VA_START + (1 << (VA_BITS - 3)) - (1 << 61)
# in 32-bit arithmetic
KASAN_SHADOW_OFFSET := $(shell printf "0x%08x00000000\n" $$(( \
@@ -74,7 +95,12 @@
core-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
# Default target when executing plain make
+ifeq ($(CONFIG_BUILD_ARM64_APPENDED_DTB_IMAGE),y)
+KBUILD_IMAGE := $(subst $\",,$(CONFIG_BUILD_ARM64_APPENDED_KERNEL_IMAGE_NAME))
+else
KBUILD_IMAGE := Image.gz
+endif
+
KBUILD_DTBS := dtbs
all: $(KBUILD_IMAGE) $(KBUILD_DTBS)
@@ -101,6 +127,9 @@
dtbs_install:
$(Q)$(MAKE) $(dtbinst)=$(boot)/dts
+Image-dtb Image.gz-dtb: vmlinux scripts dtbs
+ $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
+
PHONY += vdso_install
vdso_install:
$(Q)$(MAKE) $(build)=arch/arm64/kernel/vdso $@
@@ -110,6 +139,16 @@
$(Q)$(MAKE) $(clean)=$(boot)
$(Q)$(MAKE) $(clean)=$(boot)/dts
+# We need to generate vdso-offsets.h before compiling certain files in kernel/.
+# In order to do that, we should use the archprepare target, but we can't since
+# asm-offsets.h is included in some files used to generate vdso-offsets.h, and
+# asm-offsets.h is built in prepare0, for which archprepare is a dependency.
+# Therefore we need to generate the header after prepare0 has been made, hence
+# this hack.
+prepare: vdso_prepare
+vdso_prepare: prepare0
+ $(Q)$(MAKE) $(build)=arch/arm64/kernel/vdso include/generated/vdso-offsets.h
+
define archhelp
echo '* Image.gz - Compressed kernel image (arch/$(ARCH)/boot/Image.gz)'
echo ' Image - Uncompressed kernel image (arch/$(ARCH)/boot/Image)'
diff --git a/arch/arm64/boot/.gitignore b/arch/arm64/boot/.gitignore
index 8dab0bb..34e3520 100644
--- a/arch/arm64/boot/.gitignore
+++ b/arch/arm64/boot/.gitignore
@@ -1,2 +1,4 @@
Image
+Image-dtb
Image.gz
+Image.gz-dtb
diff --git a/arch/arm64/boot/Makefile b/arch/arm64/boot/Makefile
index abcbba2..7ab8e74 100644
--- a/arch/arm64/boot/Makefile
+++ b/arch/arm64/boot/Makefile
@@ -14,14 +14,27 @@
# Based on the ia64 boot/Makefile.
#
+include $(srctree)/arch/arm64/boot/dts/Makefile
+
targets := Image Image.gz
+DTB_NAMES := $(subst $\",,$(CONFIG_BUILD_ARM64_APPENDED_DTB_IMAGE_NAMES))
+ifneq ($(DTB_NAMES),)
+DTB_LIST := $(addsuffix .dtb,$(DTB_NAMES))
+else
+DTB_LIST := $(dtb-y)
+endif
+DTB_OBJS := $(addprefix $(obj)/dts/,$(DTB_LIST))
+
$(obj)/Image: vmlinux FORCE
$(call if_changed,objcopy)
$(obj)/Image.bz2: $(obj)/Image FORCE
$(call if_changed,bzip2)
+$(obj)/Image-dtb: $(obj)/Image $(DTB_OBJS) FORCE
+ $(call if_changed,cat)
+
$(obj)/Image.gz: $(obj)/Image FORCE
$(call if_changed,gzip)
@@ -34,6 +47,9 @@
$(obj)/Image.lzo: $(obj)/Image FORCE
$(call if_changed,lzo)
+$(obj)/Image.gz-dtb: $(obj)/Image.gz $(DTB_OBJS) FORCE
+ $(call if_changed,cat)
+
install: $(obj)/Image
$(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
$(obj)/Image System.map "$(INSTALL_PATH)"
diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index eb3c42d..062edb2 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -21,3 +21,17 @@
dtb-$(CONFIG_OF_ALL_DTBS) := $(patsubst $(dtstree)/%.dts,%.dtb, $(foreach d,$(dts-dirs), $(wildcard $(dtstree)/$(d)/*.dts)))
always := $(dtb-y)
+
+targets += dtbs
+
+DTB_NAMES := $(subst $\",,$(CONFIG_BUILD_ARM64_APPENDED_DTB_IMAGE_NAMES))
+ifneq ($(DTB_NAMES),)
+DTB_LIST := $(addsuffix .dtb,$(DTB_NAMES))
+else
+DTB_LIST := $(dtb-y)
+endif
+targets += $(DTB_LIST)
+
+dtbs: $(addprefix $(obj)/, $(DTB_LIST))
+
+clean-files := dts/*.dtb *.dtb
diff --git a/arch/arm64/boot/dts/arm/juno-r1.dts b/arch/arm64/boot/dts/arm/juno-r1.dts
index 93bc3d7..29315af 100644
--- a/arch/arm64/boot/dts/arm/juno-r1.dts
+++ b/arch/arm64/boot/dts/arm/juno-r1.dts
@@ -60,6 +60,28 @@
};
};
+ idle-states {
+ entry-method = "arm,psci";
+
+ CPU_SLEEP_0: cpu-sleep-0 {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x0010000>;
+ local-timer-stop;
+ entry-latency-us = <300>;
+ exit-latency-us = <1200>;
+ min-residency-us = <2000>;
+ };
+
+ CLUSTER_SLEEP_0: cluster-sleep-0 {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x1010000>;
+ local-timer-stop;
+ entry-latency-us = <400>;
+ exit-latency-us = <1200>;
+ min-residency-us = <2500>;
+ };
+ };
+
A57_0: cpu@0 {
compatible = "arm,cortex-a57","arm,armv8";
reg = <0x0 0x0>;
@@ -67,6 +89,7 @@
enable-method = "psci";
next-level-cache = <&A57_L2>;
clocks = <&scpi_dvfs 0>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
};
A57_1: cpu@1 {
@@ -76,6 +99,7 @@
enable-method = "psci";
next-level-cache = <&A57_L2>;
clocks = <&scpi_dvfs 0>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
};
A53_0: cpu@100 {
@@ -85,6 +109,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
};
A53_1: cpu@101 {
@@ -94,6 +119,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
};
A53_2: cpu@102 {
@@ -103,6 +129,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
};
A53_3: cpu@103 {
@@ -112,6 +139,7 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
};
A57_L2: l2-cache0 {
diff --git a/arch/arm64/boot/dts/arm/juno-sched-energy.dtsi b/arch/arm64/boot/dts/arm/juno-sched-energy.dtsi
new file mode 100644
index 0000000..38207e4
--- /dev/null
+++ b/arch/arm64/boot/dts/arm/juno-sched-energy.dtsi
@@ -0,0 +1,147 @@
+/*
+ * ARM JUNO specific energy cost model data. There are no unit requirements for
+ * the data. Data can be normalized to any reference point, but the
+ * normalization must be consistent. That is, one bogo-joule/watt must be the
+ * same quantity for all data, but we don't care what it is.
+ */
+
+/* static struct idle_state idle_states_cluster_a53[] = { */
+/* { .power = 56 }, /\* arch_cpu_idle() (active idle) = WFI *\/ */
+/* { .power = 56 }, /\* WFI *\/ */
+/* { .power = 56 }, /\* cpu-sleep-0 *\/ */
+/* { .power = 17 }, /\* cluster-sleep-0 *\/ */
+/* }; */
+
+/* static struct idle_state idle_states_cluster_a57[] = { */
+/* { .power = 65 }, /\* arch_cpu_idle() (active idle) = WFI *\/ */
+/* { .power = 65 }, /\* WFI *\/ */
+/* { .power = 65 }, /\* cpu-sleep-0 *\/ */
+/* { .power = 24 }, /\* cluster-sleep-0 *\/ */
+/* }; */
+
+/* static struct capacity_state cap_states_cluster_a53[] = { */
+/* /\* Power per cluster *\/ */
+/* { .cap = 235, .power = 26, }, /\* 450 MHz *\/ */
+/* { .cap = 303, .power = 30, }, /\* 575 MHz *\/ */
+/* { .cap = 368, .power = 39, }, /\* 700 MHz *\/ */
+/* { .cap = 406, .power = 47, }, /\* 775 MHz *\/ */
+/* { .cap = 447, .power = 57, }, /\* 850 Mhz *\/ */
+/* }; */
+
+/* static struct capacity_state cap_states_cluster_a57[] = { */
+/* /\* Power per cluster *\/ */
+/* { .cap = 417, .power = 24, }, /\* 450 MHz *\/ */
+/* { .cap = 579, .power = 32, }, /\* 625 MHz *\/ */
+/* { .cap = 744, .power = 43, }, /\* 800 MHz *\/ */
+/* { .cap = 883, .power = 49, }, /\* 950 MHz *\/ */
+/* { .cap = 1024, .power = 64, }, /\* 1100 MHz *\/ */
+/* }; */
+
+/* static struct sched_group_energy energy_cluster_a53 = { */
+/* .nr_idle_states = ARRAY_SIZE(idle_states_cluster_a53), */
+/* .idle_states = idle_states_cluster_a53, */
+/* .nr_cap_states = ARRAY_SIZE(cap_states_cluster_a53), */
+/* .cap_states = cap_states_cluster_a53, */
+/* }; */
+
+/* static struct sched_group_energy energy_cluster_a57 = { */
+/* .nr_idle_states = ARRAY_SIZE(idle_states_cluster_a57), */
+/* .idle_states = idle_states_cluster_a57, */
+/* .nr_cap_states = ARRAY_SIZE(cap_states_cluster_a57), */
+/* .cap_states = cap_states_cluster_a57, */
+/* }; */
+
+/* static struct idle_state idle_states_core_a53[] = { */
+/* { .power = 6 }, /\* arch_cpu_idle() (active idle) = WFI *\/ */
+/* { .power = 6 }, /\* WFI *\/ */
+/* { .power = 0 }, /\* cpu-sleep-0 *\/ */
+/* { .power = 0 }, /\* cluster-sleep-0 *\/ */
+/* }; */
+
+/* static struct idle_state idle_states_core_a57[] = { */
+/* { .power = 15 }, /\* arch_cpu_idle() (active idle) = WFI *\/ */
+/* { .power = 15 }, /\* WFI *\/ */
+/* { .power = 0 }, /\* cpu-sleep-0 *\/ */
+/* { .power = 0 }, /\* cluster-sleep-0 *\/ */
+/* }; */
+
+/* static struct capacity_state cap_states_core_a53[] = { */
+/* /\* Power per cpu *\/ */
+/* { .cap = 235, .power = 33, }, /\* 450 MHz *\/ */
+/* { .cap = 302, .power = 46, }, /\* 575 MHz *\/ */
+/* { .cap = 368, .power = 61, }, /\* 700 MHz *\/ */
+/* { .cap = 406, .power = 76, }, /\* 775 MHz *\/ */
+/* { .cap = 447, .power = 93, }, /\* 850 Mhz *\/ */
+/* }; */
+
+/* static struct capacity_state cap_states_core_a57[] = { */
+/* /\* Power per cpu *\/ */
+/* { .cap = 417, .power = 168, }, /\* 450 MHz *\/ */
+/* { .cap = 579, .power = 251, }, /\* 625 MHz *\/ */
+/* { .cap = 744, .power = 359, }, /\* 800 MHz *\/ */
+/* { .cap = 883, .power = 479, }, /\* 950 MHz *\/ */
+/* { .cap = 1024, .power = 616, }, /\* 1100 MHz *\/ */
+/* }; */
+
+energy-costs {
+ CPU_COST_A57: core-cost0 {
+ busy-cost-data = <
+ 417 168
+ 579 251
+ 744 359
+ 883 479
+ 1023 616
+ >;
+ idle-cost-data = <
+ 15
+ 15
+ 0
+ 0
+ >;
+ };
+ CPU_COST_A53: core-cost1 {
+ busy-cost-data = <
+ 235 33
+ 302 46
+ 368 61
+ 406 76
+ 447 93
+ >;
+ idle-cost-data = <
+ 6
+ 6
+ 0
+ 0
+ >;
+ };
+ CLUSTER_COST_A57: cluster-cost0 {
+ busy-cost-data = <
+ 417 24
+ 579 32
+ 744 43
+ 883 49
+ 1024 64
+ >;
+ idle-cost-data = <
+ 65
+ 65
+ 65
+ 24
+ >;
+ };
+ CLUSTER_COST_A53: cluster-cost1 {
+ busy-cost-data = <
+ 235 26
+ 303 30
+ 368 39
+ 406 47
+ 447 57
+ >;
+ idle-cost-data = <
+ 56
+ 56
+ 56
+ 17
+ >;
+ };
+};
diff --git a/arch/arm64/boot/dts/arm/juno.dts b/arch/arm64/boot/dts/arm/juno.dts
index 53442b5..a500493 100644
--- a/arch/arm64/boot/dts/arm/juno.dts
+++ b/arch/arm64/boot/dts/arm/juno.dts
@@ -60,6 +60,28 @@
};
};
+ idle-states {
+ entry-method = "arm,psci";
+
+ CPU_SLEEP_0: cpu-sleep-0 {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x0010000>;
+ local-timer-stop;
+ entry-latency-us = <300>;
+ exit-latency-us = <1200>;
+ min-residency-us = <2000>;
+ };
+
+ CLUSTER_SLEEP_0: cluster-sleep-0 {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x1010000>;
+ local-timer-stop;
+ entry-latency-us = <400>;
+ exit-latency-us = <1200>;
+ min-residency-us = <2500>;
+ };
+ };
+
A57_0: cpu@0 {
compatible = "arm,cortex-a57","arm,armv8";
reg = <0x0 0x0>;
@@ -67,6 +89,8 @@
enable-method = "psci";
next-level-cache = <&A57_L2>;
clocks = <&scpi_dvfs 0>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_A57 &CLUSTER_COST_A57>;
};
A57_1: cpu@1 {
@@ -76,6 +100,8 @@
enable-method = "psci";
next-level-cache = <&A57_L2>;
clocks = <&scpi_dvfs 0>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_A57 &CLUSTER_COST_A57>;
};
A53_0: cpu@100 {
@@ -85,6 +111,8 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_A53 &CLUSTER_COST_A53>;
};
A53_1: cpu@101 {
@@ -94,6 +122,8 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_A53 &CLUSTER_COST_A53>;
};
A53_2: cpu@102 {
@@ -103,6 +133,8 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_A53 &CLUSTER_COST_A53>;
};
A53_3: cpu@103 {
@@ -112,6 +144,8 @@
enable-method = "psci";
next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 1>;
+ cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
+ sched-energy-costs = <&CPU_COST_A53 &CLUSTER_COST_A53>;
};
A57_L2: l2-cache0 {
@@ -121,6 +155,8 @@
A53_L2: l2-cache1 {
compatible = "cache";
};
+
+ /include/ "juno-sched-energy.dtsi"
};
pmu_a57 {
diff --git a/arch/arm64/configs/ranchu64_defconfig b/arch/arm64/configs/ranchu64_defconfig
new file mode 100644
index 0000000..7f847fc
--- /dev/null
+++ b/arch/arm64/configs/ranchu64_defconfig
@@ -0,0 +1,359 @@
+# CONFIG_LOCALVERSION_AUTO is not set
+# CONFIG_SWAP is not set
+CONFIG_POSIX_MQUEUE=y
+# CONFIG_USELIB is not set
+CONFIG_AUDIT=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_LOG_BUF_SHIFT=14
+CONFIG_CGROUP_DEBUG=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_NAMESPACES=y
+CONFIG_SCHED_AUTOGROUP=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_KALLSYMS_ALL=y
+CONFIG_EMBEDDED=y
+# CONFIG_COMPAT_BRK is not set
+CONFIG_PROFILING=y
+CONFIG_CC_STACKPROTECTOR_STRONG=y
+CONFIG_ARCH_MMAP_RND_BITS=24
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS=16
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODVERSIONS=y
+# CONFIG_BLK_DEV_BSG is not set
+# CONFIG_IOSCHED_DEADLINE is not set
+CONFIG_ARCH_VEXPRESS=y
+CONFIG_NR_CPUS=4
+CONFIG_PREEMPT=y
+CONFIG_KSM=y
+CONFIG_SECCOMP=y
+CONFIG_ARMV8_DEPRECATED=y
+CONFIG_SWP_EMULATION=y
+CONFIG_CP15_BARRIER_EMULATION=y
+CONFIG_SETEND_EMULATION=y
+CONFIG_ARM64_SW_TTBR0_PAN=y
+CONFIG_RANDOMIZE_BASE=y
+CONFIG_CMDLINE="console=ttyAMA0"
+# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
+CONFIG_COMPAT=y
+CONFIG_PM_AUTOSLEEP=y
+CONFIG_PM_WAKELOCKS=y
+CONFIG_PM_WAKELOCKS_LIMIT=0
+# CONFIG_PM_WAKELOCKS_GC is not set
+CONFIG_PM_DEBUG=y
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_XFRM_USER=y
+CONFIG_NET_KEY=y
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+CONFIG_NET_IPVTI=y
+CONFIG_INET_ESP=y
+# CONFIG_INET_LRO is not set
+CONFIG_INET_DIAG_DESTROY=y
+CONFIG_IPV6_ROUTER_PREF=y
+CONFIG_IPV6_ROUTE_INFO=y
+CONFIG_IPV6_OPTIMISTIC_DAD=y
+CONFIG_INET6_AH=y
+CONFIG_INET6_ESP=y
+CONFIG_INET6_IPCOMP=y
+CONFIG_IPV6_MIP6=y
+CONFIG_IPV6_VTI=y
+CONFIG_IPV6_MULTIPLE_TABLES=y
+CONFIG_NETFILTER=y
+CONFIG_NF_CONNTRACK=y
+CONFIG_NF_CONNTRACK_SECMARK=y
+CONFIG_NF_CONNTRACK_EVENTS=y
+CONFIG_NF_CT_PROTO_DCCP=y
+CONFIG_NF_CT_PROTO_SCTP=y
+CONFIG_NF_CT_PROTO_UDPLITE=y
+CONFIG_NF_CONNTRACK_AMANDA=y
+CONFIG_NF_CONNTRACK_FTP=y
+CONFIG_NF_CONNTRACK_H323=y
+CONFIG_NF_CONNTRACK_IRC=y
+CONFIG_NF_CONNTRACK_NETBIOS_NS=y
+CONFIG_NF_CONNTRACK_PPTP=y
+CONFIG_NF_CONNTRACK_SANE=y
+CONFIG_NF_CONNTRACK_TFTP=y
+CONFIG_NF_CT_NETLINK=y
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=y
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=y
+CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=y
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=y
+CONFIG_NETFILTER_XT_TARGET_MARK=y
+CONFIG_NETFILTER_XT_TARGET_NFLOG=y
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
+CONFIG_NETFILTER_XT_TARGET_TPROXY=y
+CONFIG_NETFILTER_XT_TARGET_TRACE=y
+CONFIG_NETFILTER_XT_TARGET_SECMARK=y
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=y
+CONFIG_NETFILTER_XT_MATCH_COMMENT=y
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=y
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=y
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=y
+CONFIG_NETFILTER_XT_MATCH_HELPER=y
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=y
+CONFIG_NETFILTER_XT_MATCH_LENGTH=y
+CONFIG_NETFILTER_XT_MATCH_LIMIT=y
+CONFIG_NETFILTER_XT_MATCH_MAC=y
+CONFIG_NETFILTER_XT_MATCH_MARK=y
+CONFIG_NETFILTER_XT_MATCH_POLICY=y
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=y
+CONFIG_NETFILTER_XT_MATCH_QTAGUID=y
+CONFIG_NETFILTER_XT_MATCH_QUOTA=y
+CONFIG_NETFILTER_XT_MATCH_QUOTA2=y
+CONFIG_NETFILTER_XT_MATCH_SOCKET=y
+CONFIG_NETFILTER_XT_MATCH_STATE=y
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
+CONFIG_NETFILTER_XT_MATCH_STRING=y
+CONFIG_NETFILTER_XT_MATCH_TIME=y
+CONFIG_NETFILTER_XT_MATCH_U32=y
+CONFIG_NF_CONNTRACK_IPV4=y
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_NF_MATCH_AH=y
+CONFIG_IP_NF_MATCH_ECN=y
+CONFIG_IP_NF_MATCH_RPFILTER=y
+CONFIG_IP_NF_MATCH_TTL=y
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_TARGET_REJECT=y
+CONFIG_IP_NF_NAT=y
+CONFIG_IP_NF_TARGET_MASQUERADE=y
+CONFIG_IP_NF_TARGET_NETMAP=y
+CONFIG_IP_NF_TARGET_REDIRECT=y
+CONFIG_IP_NF_MANGLE=y
+CONFIG_IP_NF_TARGET_ECN=y
+CONFIG_IP_NF_TARGET_TTL=y
+CONFIG_IP_NF_RAW=y
+CONFIG_IP_NF_SECURITY=y
+CONFIG_IP_NF_ARPTABLES=y
+CONFIG_IP_NF_ARPFILTER=y
+CONFIG_IP_NF_ARP_MANGLE=y
+CONFIG_NF_CONNTRACK_IPV6=y
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_MATCH_AH=y
+CONFIG_IP6_NF_MATCH_EUI64=y
+CONFIG_IP6_NF_MATCH_FRAG=y
+CONFIG_IP6_NF_MATCH_OPTS=y
+CONFIG_IP6_NF_MATCH_HL=y
+CONFIG_IP6_NF_MATCH_IPV6HEADER=y
+CONFIG_IP6_NF_MATCH_MH=y
+CONFIG_IP6_NF_MATCH_RPFILTER=y
+CONFIG_IP6_NF_MATCH_RT=y
+CONFIG_IP6_NF_TARGET_HL=y
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=y
+CONFIG_IP6_NF_MANGLE=y
+CONFIG_IP6_NF_RAW=y
+CONFIG_BRIDGE=y
+CONFIG_NET_SCHED=y
+CONFIG_NET_SCH_HTB=y
+CONFIG_NET_CLS_U32=y
+CONFIG_NET_EMATCH=y
+CONFIG_NET_EMATCH_U32=y
+CONFIG_NET_CLS_ACT=y
+CONFIG_MAC80211=y
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BLK_DEV_RAM_SIZE=8192
+CONFIG_VIRTIO_BLK=y
+CONFIG_UID_SYS_STATS=y
+CONFIG_MEMORY_STATE_TIME=y
+CONFIG_SCSI=y
+# CONFIG_SCSI_PROC_FS is not set
+CONFIG_BLK_DEV_SD=y
+# CONFIG_SCSI_LOWLEVEL is not set
+CONFIG_MD=y
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_CRYPT=y
+CONFIG_DM_UEVENT=y
+CONFIG_DM_VERITY=y
+CONFIG_DM_VERITY_FEC=y
+CONFIG_NETDEVICES=y
+CONFIG_TUN=y
+CONFIG_VIRTIO_NET=y
+CONFIG_SMC91X=y
+CONFIG_PPP=y
+CONFIG_PPP_BSDCOMP=y
+CONFIG_PPP_DEFLATE=y
+CONFIG_PPP_MPPE=y
+CONFIG_PPPOLAC=y
+CONFIG_PPPOPNS=y
+CONFIG_USB_USBNET=y
+CONFIG_INPUT_EVDEV=y
+CONFIG_INPUT_KEYRESET=y
+CONFIG_KEYBOARD_GOLDFISH_EVENTS=y
+CONFIG_KEYBOARD_GOLDFISH_ROTARY=y
+# CONFIG_INPUT_MOUSE is not set
+CONFIG_INPUT_JOYSTICK=y
+CONFIG_JOYSTICK_XPAD=y
+CONFIG_JOYSTICK_XPAD_FF=y
+CONFIG_JOYSTICK_XPAD_LEDS=y
+CONFIG_INPUT_TABLET=y
+CONFIG_TABLET_USB_ACECAD=y
+CONFIG_TABLET_USB_AIPTEK=y
+CONFIG_TABLET_USB_GTCO=y
+CONFIG_TABLET_USB_HANWANG=y
+CONFIG_TABLET_USB_KBTAB=y
+CONFIG_INPUT_MISC=y
+CONFIG_INPUT_KEYCHORD=y
+CONFIG_INPUT_UINPUT=y
+CONFIG_INPUT_GPIO=y
+# CONFIG_SERIO_SERPORT is not set
+# CONFIG_VT is not set
+# CONFIG_LEGACY_PTYS is not set
+# CONFIG_DEVMEM is not set
+# CONFIG_DEVKMEM is not set
+CONFIG_SERIAL_AMBA_PL011=y
+CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+CONFIG_VIRTIO_CONSOLE=y
+CONFIG_HW_RANDOM=y
+CONFIG_HW_RANDOM_VIRTIO=y
+CONFIG_BATTERY_GOLDFISH=y
+# CONFIG_HWMON is not set
+CONFIG_MEDIA_SUPPORT=y
+CONFIG_FB=y
+CONFIG_FB_GOLDFISH=y
+CONFIG_FB_SIMPLE=y
+CONFIG_BACKLIGHT_LCD_SUPPORT=y
+CONFIG_LOGO=y
+# CONFIG_LOGO_LINUX_MONO is not set
+# CONFIG_LOGO_LINUX_VGA16 is not set
+CONFIG_SOUND=y
+CONFIG_SND=y
+CONFIG_HIDRAW=y
+CONFIG_UHID=y
+CONFIG_HID_A4TECH=y
+CONFIG_HID_ACRUX=y
+CONFIG_HID_ACRUX_FF=y
+CONFIG_HID_APPLE=y
+CONFIG_HID_BELKIN=y
+CONFIG_HID_CHERRY=y
+CONFIG_HID_CHICONY=y
+CONFIG_HID_PRODIKEYS=y
+CONFIG_HID_CYPRESS=y
+CONFIG_HID_DRAGONRISE=y
+CONFIG_DRAGONRISE_FF=y
+CONFIG_HID_EMS_FF=y
+CONFIG_HID_ELECOM=y
+CONFIG_HID_EZKEY=y
+CONFIG_HID_HOLTEK=y
+CONFIG_HID_KEYTOUCH=y
+CONFIG_HID_KYE=y
+CONFIG_HID_UCLOGIC=y
+CONFIG_HID_WALTOP=y
+CONFIG_HID_GYRATION=y
+CONFIG_HID_TWINHAN=y
+CONFIG_HID_KENSINGTON=y
+CONFIG_HID_LCPOWER=y
+CONFIG_HID_LOGITECH=y
+CONFIG_HID_LOGITECH_DJ=y
+CONFIG_LOGITECH_FF=y
+CONFIG_LOGIRUMBLEPAD2_FF=y
+CONFIG_LOGIG940_FF=y
+CONFIG_HID_MAGICMOUSE=y
+CONFIG_HID_MICROSOFT=y
+CONFIG_HID_MONTEREY=y
+CONFIG_HID_MULTITOUCH=y
+CONFIG_HID_NTRIG=y
+CONFIG_HID_ORTEK=y
+CONFIG_HID_PANTHERLORD=y
+CONFIG_PANTHERLORD_FF=y
+CONFIG_HID_PETALYNX=y
+CONFIG_HID_PICOLCD=y
+CONFIG_HID_PRIMAX=y
+CONFIG_HID_ROCCAT=y
+CONFIG_HID_SAITEK=y
+CONFIG_HID_SAMSUNG=y
+CONFIG_HID_SONY=y
+CONFIG_HID_SPEEDLINK=y
+CONFIG_HID_SUNPLUS=y
+CONFIG_HID_GREENASIA=y
+CONFIG_GREENASIA_FF=y
+CONFIG_HID_SMARTJOYPLUS=y
+CONFIG_SMARTJOYPLUS_FF=y
+CONFIG_HID_TIVO=y
+CONFIG_HID_TOPSEED=y
+CONFIG_HID_THRUSTMASTER=y
+CONFIG_HID_WACOM=y
+CONFIG_HID_WIIMOTE=y
+CONFIG_HID_ZEROPLUS=y
+CONFIG_HID_ZYDACRON=y
+CONFIG_USB_HIDDEV=y
+CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
+CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_GADGET=y
+CONFIG_USB_CONFIGFS=y
+CONFIG_USB_CONFIGFS_F_FS=y
+CONFIG_USB_CONFIGFS_F_MTP=y
+CONFIG_USB_CONFIGFS_F_PTP=y
+CONFIG_USB_CONFIGFS_F_ACC=y
+CONFIG_USB_CONFIGFS_F_AUDIO_SRC=y
+CONFIG_USB_CONFIGFS_UEVENT=y
+CONFIG_USB_CONFIGFS_F_MIDI=y
+CONFIG_RTC_CLASS=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_STAGING=y
+CONFIG_ASHMEM=y
+CONFIG_ANDROID_TIMED_GPIO=y
+CONFIG_ANDROID_LOW_MEMORY_KILLER=y
+CONFIG_SYNC=y
+CONFIG_SW_SYNC=y
+CONFIG_SW_SYNC_USER=y
+CONFIG_ION=y
+CONFIG_GOLDFISH_AUDIO=y
+CONFIG_GOLDFISH=y
+CONFIG_GOLDFISH_PIPE=y
+# CONFIG_IOMMU_SUPPORT is not set
+CONFIG_ANDROID=y
+CONFIG_ANDROID_BINDER_IPC=y
+CONFIG_EXT2_FS=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_SECURITY=y
+CONFIG_QUOTA=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
+CONFIG_QFMT_V2=y
+CONFIG_FUSE_FS=y
+CONFIG_CUSE=y
+CONFIG_MSDOS_FS=y
+CONFIG_VFAT_FS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_SDCARD_FS=y
+CONFIG_PSTORE=y
+CONFIG_PSTORE_CONSOLE=y
+CONFIG_PSTORE_RAM=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_DEBUG_INFO=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_PANIC_TIMEOUT=5
+CONFIG_SCHEDSTATS=y
+CONFIG_TIMER_STATS=y
+CONFIG_ENABLE_DEFAULT_TRACERS=y
+CONFIG_ATOMIC64_SELFTEST=y
+CONFIG_KEYS=y
+CONFIG_SECURITY_PERF_EVENTS_RESTRICT=y
+CONFIG_SECURITY=y
+CONFIG_SECURITY_NETWORK=y
+CONFIG_HARDENED_USERCOPY=y
+CONFIG_SECURITY_SELINUX=y
+CONFIG_CRYPTO_SHA512=y
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2cf32e9..487cd38 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -23,6 +23,11 @@
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_HASH
+config CRYPTO_POLY_HASH_ARM64_CE
+ tristate "poly_hash (for HEH encryption mode) using ARMv8 Crypto Extensions"
+ depends on ARM64 && KERNEL_MODE_NEON
+ select CRYPTO_HASH
+
config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
@@ -53,4 +58,11 @@
tristate "CRC32 and CRC32C using optional ARMv8 instructions"
depends on ARM64
select CRYPTO_HASH
+
+config CRYPTO_SPECK_NEON
+ tristate "NEON accelerated Speck cipher algorithms"
+ depends on KERNEL_MODE_NEON
+ select CRYPTO_BLKCIPHER
+ select CRYPTO_GF128MUL
+ select CRYPTO_SPECK
endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index abb79b3..28a7124 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -17,6 +17,9 @@
obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
+obj-$(CONFIG_CRYPTO_POLY_HASH_ARM64_CE) += poly-hash-ce.o
+poly-hash-ce-y := poly-hash-ce-glue.o poly-hash-ce-core.o
+
obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
CFLAGS_aes-ce-cipher.o += -march=armv8-a+crypto
@@ -29,6 +32,9 @@
obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
aes-neon-blk-y := aes-glue-neon.o aes-neon.o
+obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
+speck-neon-y := speck-neon-core.o speck-neon-glue.o
+
AFLAGS_aes-ce.o := -DINTERLEAVE=4
AFLAGS_aes-neon.o := -DINTERLEAVE=4
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 6a51dfc..448b874a 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -294,7 +294,7 @@
.cra_blkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
- .ivsize = AES_BLOCK_SIZE,
+ .ivsize = 0,
.setkey = aes_setkey,
.encrypt = ecb_encrypt,
.decrypt = ecb_decrypt,
@@ -371,7 +371,7 @@
.cra_ablkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
- .ivsize = AES_BLOCK_SIZE,
+ .ivsize = 0,
.setkey = ablk_set_key,
.encrypt = ablk_encrypt,
.decrypt = ablk_decrypt,
diff --git a/arch/arm64/crypto/poly-hash-ce-core.S b/arch/arm64/crypto/poly-hash-ce-core.S
new file mode 100644
index 0000000..8ccb544
--- /dev/null
+++ b/arch/arm64/crypto/poly-hash-ce-core.S
@@ -0,0 +1,163 @@
+/*
+ * Accelerated poly_hash implementation with ARMv8 PMULL instructions.
+ *
+ * Based on ghash-ce-core.S.
+ *
+ * Copyright (C) 2014 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2017 Google, Inc. <ebiggers@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+ KEY .req v0
+ KEY2 .req v1
+ T1 .req v2
+ T2 .req v3
+ GSTAR .req v4
+ XL .req v5
+ XM .req v6
+ XH .req v7
+
+ .text
+ .arch armv8-a+crypto
+
+ /* 16-byte aligned (2**4 = 16); not required, but might as well */
+ .align 4
+.Lgstar:
+ .quad 0x87, 0x87
+
+/*
+ * void pmull_poly_hash_update(le128 *digest, const le128 *key,
+ * const u8 *src, unsigned int blocks,
+ * unsigned int partial);
+ */
+ENTRY(pmull_poly_hash_update)
+
+ /* Load digest into XL */
+ ld1 {XL.16b}, [x0]
+
+ /* Load key into KEY */
+ ld1 {KEY.16b}, [x1]
+
+ /* Load g*(x) = g(x) + x^128 = x^7 + x^2 + x + 1 into both halves of
+ * GSTAR */
+ adr x1, .Lgstar
+ ld1 {GSTAR.2d}, [x1]
+
+ /* Set KEY2 to (KEY[1]+KEY[0]):(KEY[1]+KEY[0]). This is needed for
+ * Karatsuba multiplication. */
+ ext KEY2.16b, KEY.16b, KEY.16b, #8
+ eor KEY2.16b, KEY2.16b, KEY.16b
+
+ /* If 'partial' is nonzero, then we're finishing a pending block and
+ * should go right to the multiplication. */
+ cbnz w4, 1f
+
+0:
+ /* Add the next block from 'src' to the digest */
+ ld1 {T1.16b}, [x2], #16
+ eor XL.16b, XL.16b, T1.16b
+ sub w3, w3, #1
+
+1:
+ /*
+ * Multiply the current 128-bit digest (a1:a0, in XL) by the 128-bit key
+ * (b1:b0, in KEY) using Karatsuba multiplication.
+ */
+
+ /* T1 = (a1+a0):(a1+a0) */
+ ext T1.16b, XL.16b, XL.16b, #8
+ eor T1.16b, T1.16b, XL.16b
+
+ /* XH = a1 * b1 */
+ pmull2 XH.1q, XL.2d, KEY.2d
+
+ /* XL = a0 * b0 */
+ pmull XL.1q, XL.1d, KEY.1d
+
+ /* XM = (a1+a0) * (b1+b0) */
+ pmull XM.1q, T1.1d, KEY2.1d
+
+ /* XM += (XH[0]:XL[1]) + XL + XH */
+ ext T1.16b, XL.16b, XH.16b, #8
+ eor T2.16b, XL.16b, XH.16b
+ eor XM.16b, XM.16b, T1.16b
+ eor XM.16b, XM.16b, T2.16b
+
+ /*
+ * Now the 256-bit product is in XH[1]:XM:XL[0]. It represents a
+ * polynomial over GF(2) with degree as large as 255. We need to
+ * compute its remainder modulo g(x) = x^128+x^7+x^2+x+1. For this it
+ * is sufficient to compute the remainder of the high half 'c(x)x^128'
+ * add it to the low half. To reduce the high half we use the Barrett
+ * reduction method. The basic idea is that we can express the
+ * remainder p(x) as g(x)q(x) mod x^128, where q(x) = (c(x)x^128)/g(x).
+ * As detailed in [1], to avoid having to divide by g(x) at runtime the
+ * following equivalent expression can be derived:
+ *
+ * p(x) = [ g*(x)((c(x)q+(x))/x^128) ] mod x^128
+ *
+ * where g*(x) = x^128+g(x) = x^7+x^2+x+1, and q+(x) = x^256/g(x) = g(x)
+ * in this case. This is also equivalent to:
+ *
+ * p(x) = [ g*(x)((c(x)(x^128 + g*(x)))/x^128) ] mod x^128
+ * = [ g*(x)(c(x) + (c(x)g*(x))/x^128) ] mod x^128
+ *
+ * Since deg g*(x) < 64:
+ *
+ * p(x) = [ g*(x)(c(x) + ((c(x)/x^64)g*(x))/x^64) ] mod x^128
+ * = [ g*(x)((c(x)/x^64)x^64 + (c(x) mod x^64) +
+ * ((c(x)/x^64)g*(x))/x^64) ] mod x^128
+ *
+ * Letting t(x) = g*(x)(c(x)/x^64):
+ *
+ * p(x) = [ t(x)x^64 + g*(x)((c(x) mod x^64) + t(x)/x^64) ] mod x^128
+ *
+ * Therefore, to do the reduction we only need to issue two 64-bit =>
+ * 128-bit carryless multiplications: g*(x) times c(x)/x^64, and g*(x)
+ * times ((c(x) mod x^64) + t(x)/x^64). (Multiplication by x^64 doesn't
+ * count since it is simply a shift or move.)
+ *
+ * An alternate reduction method, also based on Barrett reduction and
+ * described in [1], uses only shifts and XORs --- no multiplications.
+ * However, the method with multiplications requires fewer instructions
+ * and is faster on processors with fast carryless multiplication.
+ *
+ * [1] "Intel Carry-Less Multiplication Instruction and its Usage for
+ * Computing the GCM Mode",
+ * https://software.intel.com/sites/default/files/managed/72/cc/clmul-wp-rev-2.02-2014-04-20.pdf
+ */
+
+ /* 256-bit product is XH[1]:XM:XL[0], so c(x) is XH[1]:XM[1] */
+
+ /* T1 = t(x) = g*(x)(c(x)/x^64) */
+ pmull2 T1.1q, GSTAR.2d, XH.2d
+
+ /* T2 = g*(x)((c(x) mod x^64) + t(x)/x^64) */
+ eor T2.16b, XM.16b, T1.16b
+ pmull2 T2.1q, GSTAR.2d, T2.2d
+
+ /* Make XL[0] be the low half of the 128-bit result by adding the low 64
+ * bits of the T2 term to what was already there. The 't(x)x^64' term
+ * makes no difference, so skip it. */
+ eor XL.16b, XL.16b, T2.16b
+
+ /* Make XL[1] be the high half of the 128-bit result by adding the high
+ * 64 bits of the 't(x)x^64' and T2 terms to what was already in XM[0],
+ * then moving XM[0] to XL[1]. */
+ eor XM.16b, XM.16b, T1.16b
+ ext T2.16b, T2.16b, T2.16b, #8
+ eor XM.16b, XM.16b, T2.16b
+ mov XL.d[1], XM.d[0]
+
+ /* If more blocks remain, then loop back to process the next block;
+ * else, store the digest and return. */
+ cbnz w3, 0b
+ st1 {XL.16b}, [x0]
+ ret
+ENDPROC(pmull_poly_hash_update)
diff --git a/arch/arm64/crypto/poly-hash-ce-glue.c b/arch/arm64/crypto/poly-hash-ce-glue.c
new file mode 100644
index 0000000..e195740
--- /dev/null
+++ b/arch/arm64/crypto/poly-hash-ce-glue.c
@@ -0,0 +1,166 @@
+/*
+ * Accelerated poly_hash implementation with ARMv8 PMULL instructions.
+ *
+ * Based on ghash-ce-glue.c.
+ *
+ * poly_hash is part of the HEH (Hash-Encrypt-Hash) encryption mode, proposed in
+ * Internet Draft https://tools.ietf.org/html/draft-cope-heh-01.
+ *
+ * poly_hash is very similar to GHASH: both algorithms are keyed hashes which
+ * interpret their input data as coefficients of a polynomial over GF(2^128),
+ * then calculate a hash value by evaluating that polynomial at the point given
+ * by the key, e.g. using Horner's rule. The difference is that poly_hash uses
+ * the more natural "ble" convention to represent GF(2^128) elements, whereas
+ * GHASH uses the less natural "lle" convention (see include/crypto/gf128mul.h).
+ * The ble convention makes it simpler to implement GF(2^128) multiplication.
+ *
+ * Copyright (C) 2014 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2017 Google Inc. <ebiggers@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <asm/neon.h>
+#include <crypto/b128ops.h>
+#include <crypto/internal/hash.h>
+#include <linux/cpufeature.h>
+#include <linux/crypto.h>
+#include <linux/module.h>
+
+/*
+ * Note: in this algorithm we currently use 'le128' to represent GF(2^128)
+ * elements, even though poly_hash-generic uses 'be128'. Both types are
+ * actually "wrong" because the elements are actually in 'ble' format, and there
+ * should be a ble type to represent this --- as well as lle, bbe, and lbe types
+ * for the other conventions for representing GF(2^128) elements. But
+ * practically it doesn't matter which type we choose here, so we just use le128
+ * since it's arguably more accurate, while poly_hash-generic still has to use
+ * be128 because the generic GF(2^128) multiplication functions all take be128.
+ */
+
+struct poly_hash_desc_ctx {
+ le128 digest;
+ unsigned int count;
+};
+
+asmlinkage void pmull_poly_hash_update(le128 *digest, const le128 *key,
+ const u8 *src, unsigned int blocks,
+ unsigned int partial);
+
+static int poly_hash_setkey(struct crypto_shash *tfm,
+ const u8 *key, unsigned int keylen)
+{
+ if (keylen != sizeof(le128)) {
+ crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+ return -EINVAL;
+ }
+
+ memcpy(crypto_shash_ctx(tfm), key, sizeof(le128));
+ return 0;
+}
+
+static int poly_hash_init(struct shash_desc *desc)
+{
+ struct poly_hash_desc_ctx *ctx = shash_desc_ctx(desc);
+
+ ctx->digest = (le128) { 0 };
+ ctx->count = 0;
+ return 0;
+}
+
+static int poly_hash_update(struct shash_desc *desc, const u8 *src,
+ unsigned int len)
+{
+ struct poly_hash_desc_ctx *ctx = shash_desc_ctx(desc);
+ unsigned int partial = ctx->count % sizeof(le128);
+ u8 *dst = (u8 *)&ctx->digest + partial;
+
+ ctx->count += len;
+
+ /* Finishing at least one block? */
+ if (partial + len >= sizeof(le128)) {
+ const le128 *key = crypto_shash_ctx(desc->tfm);
+
+ if (partial) {
+ /* Finish the pending block. */
+ unsigned int n = sizeof(le128) - partial;
+
+ len -= n;
+ do {
+ *dst++ ^= *src++;
+ } while (--n);
+ }
+
+ /*
+ * Do the real work. If 'partial' is nonzero, this starts by
+ * multiplying 'digest' by 'key'. Then for each additional full
+ * block it adds the block to 'digest' and multiplies by 'key'.
+ */
+ kernel_neon_begin_partial(8);
+ pmull_poly_hash_update(&ctx->digest, key, src,
+ len / sizeof(le128), partial);
+ kernel_neon_end();
+
+ src += len - (len % sizeof(le128));
+ len %= sizeof(le128);
+ dst = (u8 *)&ctx->digest;
+ }
+
+ /* Continue adding the next block to 'digest'. */
+ while (len--)
+ *dst++ ^= *src++;
+ return 0;
+}
+
+static int poly_hash_final(struct shash_desc *desc, u8 *out)
+{
+ struct poly_hash_desc_ctx *ctx = shash_desc_ctx(desc);
+ unsigned int partial = ctx->count % sizeof(le128);
+
+ /* Finish the last block if needed. */
+ if (partial) {
+ const le128 *key = crypto_shash_ctx(desc->tfm);
+
+ kernel_neon_begin_partial(8);
+ pmull_poly_hash_update(&ctx->digest, key, NULL, 0, partial);
+ kernel_neon_end();
+ }
+
+ memcpy(out, &ctx->digest, sizeof(le128));
+ return 0;
+}
+
+static struct shash_alg poly_hash_alg = {
+ .digestsize = sizeof(le128),
+ .init = poly_hash_init,
+ .update = poly_hash_update,
+ .final = poly_hash_final,
+ .setkey = poly_hash_setkey,
+ .descsize = sizeof(struct poly_hash_desc_ctx),
+ .base = {
+ .cra_name = "poly_hash",
+ .cra_driver_name = "poly_hash-ce",
+ .cra_priority = 300,
+ .cra_ctxsize = sizeof(le128),
+ .cra_module = THIS_MODULE,
+ },
+};
+
+static int __init poly_hash_ce_mod_init(void)
+{
+ return crypto_register_shash(&poly_hash_alg);
+}
+
+static void __exit poly_hash_ce_mod_exit(void)
+{
+ crypto_unregister_shash(&poly_hash_alg);
+}
+
+MODULE_DESCRIPTION("Polynomial evaluation hash using ARMv8 Crypto Extensions");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_LICENSE("GPL v2");
+
+module_cpu_feature_match(PMULL, poly_hash_ce_mod_init);
+module_exit(poly_hash_ce_mod_exit);
diff --git a/arch/arm64/crypto/sha1-ce-core.S b/arch/arm64/crypto/sha1-ce-core.S
index c98e7e8..85504087 100644
--- a/arch/arm64/crypto/sha1-ce-core.S
+++ b/arch/arm64/crypto/sha1-ce-core.S
@@ -82,7 +82,8 @@
ldr dgb, [x0, #16]
/* load sha1_ce_state::finalize */
- ldr w4, [x0, #:lo12:sha1_ce_offsetof_finalize]
+ ldr_l w4, sha1_ce_offsetof_finalize, x4
+ ldr w4, [x0, x4]
/* load input */
0: ld1 {v8.4s-v11.4s}, [x1], #64
@@ -132,7 +133,8 @@
* the padding is handled by the C code in that case.
*/
cbz x4, 3f
- ldr x4, [x0, #:lo12:sha1_ce_offsetof_count]
+ ldr_l w4, sha1_ce_offsetof_count, x4
+ ldr x4, [x0, x4]
movi v9.2d, #0
mov x8, #0x80000000
movi v10.2d, #0
diff --git a/arch/arm64/crypto/sha1-ce-glue.c b/arch/arm64/crypto/sha1-ce-glue.c
index aefda986..ea319c0 100644
--- a/arch/arm64/crypto/sha1-ce-glue.c
+++ b/arch/arm64/crypto/sha1-ce-glue.c
@@ -17,9 +17,6 @@
#include <linux/crypto.h>
#include <linux/module.h>
-#define ASM_EXPORT(sym, val) \
- asm(".globl " #sym "; .set " #sym ", %0" :: "I"(val));
-
MODULE_DESCRIPTION("SHA1 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@@ -32,6 +29,9 @@
asmlinkage void sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
int blocks);
+const u32 sha1_ce_offsetof_count = offsetof(struct sha1_ce_state, sst.count);
+const u32 sha1_ce_offsetof_finalize = offsetof(struct sha1_ce_state, finalize);
+
static int sha1_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
@@ -52,11 +52,6 @@
struct sha1_ce_state *sctx = shash_desc_ctx(desc);
bool finalize = !sctx->sst.count && !(len % SHA1_BLOCK_SIZE);
- ASM_EXPORT(sha1_ce_offsetof_count,
- offsetof(struct sha1_ce_state, sst.count));
- ASM_EXPORT(sha1_ce_offsetof_finalize,
- offsetof(struct sha1_ce_state, finalize));
-
/*
* Allow the asm code to perform the finalization if there is no
* partial data and the input is a round multiple of the block size.
diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/crypto/sha2-ce-core.S
index 01cfee0..679c6c0 100644
--- a/arch/arm64/crypto/sha2-ce-core.S
+++ b/arch/arm64/crypto/sha2-ce-core.S
@@ -88,7 +88,8 @@
ld1 {dgav.4s, dgbv.4s}, [x0]
/* load sha256_ce_state::finalize */
- ldr w4, [x0, #:lo12:sha256_ce_offsetof_finalize]
+ ldr_l w4, sha256_ce_offsetof_finalize, x4
+ ldr w4, [x0, x4]
/* load input */
0: ld1 {v16.4s-v19.4s}, [x1], #64
@@ -136,7 +137,8 @@
* the padding is handled by the C code in that case.
*/
cbz x4, 3f
- ldr x4, [x0, #:lo12:sha256_ce_offsetof_count]
+ ldr_l w4, sha256_ce_offsetof_count, x4
+ ldr x4, [x0, x4]
movi v17.2d, #0
mov x8, #0x80000000
movi v18.2d, #0
diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
index 7cd5875..0ed9486 100644
--- a/arch/arm64/crypto/sha2-ce-glue.c
+++ b/arch/arm64/crypto/sha2-ce-glue.c
@@ -17,9 +17,6 @@
#include <linux/crypto.h>
#include <linux/module.h>
-#define ASM_EXPORT(sym, val) \
- asm(".globl " #sym "; .set " #sym ", %0" :: "I"(val));
-
MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
@@ -32,6 +29,11 @@
asmlinkage void sha2_ce_transform(struct sha256_ce_state *sst, u8 const *src,
int blocks);
+const u32 sha256_ce_offsetof_count = offsetof(struct sha256_ce_state,
+ sst.count);
+const u32 sha256_ce_offsetof_finalize = offsetof(struct sha256_ce_state,
+ finalize);
+
static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
@@ -52,11 +54,6 @@
struct sha256_ce_state *sctx = shash_desc_ctx(desc);
bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE);
- ASM_EXPORT(sha256_ce_offsetof_count,
- offsetof(struct sha256_ce_state, sst.count));
- ASM_EXPORT(sha256_ce_offsetof_finalize,
- offsetof(struct sha256_ce_state, finalize));
-
/*
* Allow the asm code to perform the finalization if there is no
* partial data and the input is a round multiple of the block size.
diff --git a/arch/arm64/crypto/speck-neon-core.S b/arch/arm64/crypto/speck-neon-core.S
new file mode 100644
index 0000000..b144634
--- /dev/null
+++ b/arch/arm64/crypto/speck-neon-core.S
@@ -0,0 +1,352 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM64 NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
+ *
+ * Copyright (c) 2018 Google, Inc
+ *
+ * Author: Eric Biggers <ebiggers@google.com>
+ */
+
+#include <linux/linkage.h>
+
+ .text
+
+ // arguments
+ ROUND_KEYS .req x0 // const {u64,u32} *round_keys
+ NROUNDS .req w1 // int nrounds
+ NROUNDS_X .req x1
+ DST .req x2 // void *dst
+ SRC .req x3 // const void *src
+ NBYTES .req w4 // unsigned int nbytes
+ TWEAK .req x5 // void *tweak
+
+ // registers which hold the data being encrypted/decrypted
+ // (underscores avoid a naming collision with ARM64 registers x0-x3)
+ X_0 .req v0
+ Y_0 .req v1
+ X_1 .req v2
+ Y_1 .req v3
+ X_2 .req v4
+ Y_2 .req v5
+ X_3 .req v6
+ Y_3 .req v7
+
+ // the round key, duplicated in all lanes
+ ROUND_KEY .req v8
+
+ // index vector for tbl-based 8-bit rotates
+ ROTATE_TABLE .req v9
+ ROTATE_TABLE_Q .req q9
+
+ // temporary registers
+ TMP0 .req v10
+ TMP1 .req v11
+ TMP2 .req v12
+ TMP3 .req v13
+
+ // multiplication table for updating XTS tweaks
+ GFMUL_TABLE .req v14
+ GFMUL_TABLE_Q .req q14
+
+ // next XTS tweak value(s)
+ TWEAKV_NEXT .req v15
+
+ // XTS tweaks for the blocks currently being encrypted/decrypted
+ TWEAKV0 .req v16
+ TWEAKV1 .req v17
+ TWEAKV2 .req v18
+ TWEAKV3 .req v19
+ TWEAKV4 .req v20
+ TWEAKV5 .req v21
+ TWEAKV6 .req v22
+ TWEAKV7 .req v23
+
+ .align 4
+.Lror64_8_table:
+ .octa 0x080f0e0d0c0b0a090007060504030201
+.Lror32_8_table:
+ .octa 0x0c0f0e0d080b0a090407060500030201
+.Lrol64_8_table:
+ .octa 0x0e0d0c0b0a09080f0605040302010007
+.Lrol32_8_table:
+ .octa 0x0e0d0c0f0a09080b0605040702010003
+.Lgf128mul_table:
+ .octa 0x00000000000000870000000000000001
+.Lgf64mul_table:
+ .octa 0x0000000000000000000000002d361b00
+
+/*
+ * _speck_round_128bytes() - Speck encryption round on 128 bytes at a time
+ *
+ * Do one Speck encryption round on the 128 bytes (8 blocks for Speck128, 16 for
+ * Speck64) stored in X0-X3 and Y0-Y3, using the round key stored in all lanes
+ * of ROUND_KEY. 'n' is the lane size: 64 for Speck128, or 32 for Speck64.
+ * 'lanes' is the lane specifier: "2d" for Speck128 or "4s" for Speck64.
+ */
+.macro _speck_round_128bytes n, lanes
+
+ // x = ror(x, 8)
+ tbl X_0.16b, {X_0.16b}, ROTATE_TABLE.16b
+ tbl X_1.16b, {X_1.16b}, ROTATE_TABLE.16b
+ tbl X_2.16b, {X_2.16b}, ROTATE_TABLE.16b
+ tbl X_3.16b, {X_3.16b}, ROTATE_TABLE.16b
+
+ // x += y
+ add X_0.\lanes, X_0.\lanes, Y_0.\lanes
+ add X_1.\lanes, X_1.\lanes, Y_1.\lanes
+ add X_2.\lanes, X_2.\lanes, Y_2.\lanes
+ add X_3.\lanes, X_3.\lanes, Y_3.\lanes
+
+ // x ^= k
+ eor X_0.16b, X_0.16b, ROUND_KEY.16b
+ eor X_1.16b, X_1.16b, ROUND_KEY.16b
+ eor X_2.16b, X_2.16b, ROUND_KEY.16b
+ eor X_3.16b, X_3.16b, ROUND_KEY.16b
+
+ // y = rol(y, 3)
+ shl TMP0.\lanes, Y_0.\lanes, #3
+ shl TMP1.\lanes, Y_1.\lanes, #3
+ shl TMP2.\lanes, Y_2.\lanes, #3
+ shl TMP3.\lanes, Y_3.\lanes, #3
+ sri TMP0.\lanes, Y_0.\lanes, #(\n - 3)
+ sri TMP1.\lanes, Y_1.\lanes, #(\n - 3)
+ sri TMP2.\lanes, Y_2.\lanes, #(\n - 3)
+ sri TMP3.\lanes, Y_3.\lanes, #(\n - 3)
+
+ // y ^= x
+ eor Y_0.16b, TMP0.16b, X_0.16b
+ eor Y_1.16b, TMP1.16b, X_1.16b
+ eor Y_2.16b, TMP2.16b, X_2.16b
+ eor Y_3.16b, TMP3.16b, X_3.16b
+.endm
+
+/*
+ * _speck_unround_128bytes() - Speck decryption round on 128 bytes at a time
+ *
+ * This is the inverse of _speck_round_128bytes().
+ */
+.macro _speck_unround_128bytes n, lanes
+
+ // y ^= x
+ eor TMP0.16b, Y_0.16b, X_0.16b
+ eor TMP1.16b, Y_1.16b, X_1.16b
+ eor TMP2.16b, Y_2.16b, X_2.16b
+ eor TMP3.16b, Y_3.16b, X_3.16b
+
+ // y = ror(y, 3)
+ ushr Y_0.\lanes, TMP0.\lanes, #3
+ ushr Y_1.\lanes, TMP1.\lanes, #3
+ ushr Y_2.\lanes, TMP2.\lanes, #3
+ ushr Y_3.\lanes, TMP3.\lanes, #3
+ sli Y_0.\lanes, TMP0.\lanes, #(\n - 3)
+ sli Y_1.\lanes, TMP1.\lanes, #(\n - 3)
+ sli Y_2.\lanes, TMP2.\lanes, #(\n - 3)
+ sli Y_3.\lanes, TMP3.\lanes, #(\n - 3)
+
+ // x ^= k
+ eor X_0.16b, X_0.16b, ROUND_KEY.16b
+ eor X_1.16b, X_1.16b, ROUND_KEY.16b
+ eor X_2.16b, X_2.16b, ROUND_KEY.16b
+ eor X_3.16b, X_3.16b, ROUND_KEY.16b
+
+ // x -= y
+ sub X_0.\lanes, X_0.\lanes, Y_0.\lanes
+ sub X_1.\lanes, X_1.\lanes, Y_1.\lanes
+ sub X_2.\lanes, X_2.\lanes, Y_2.\lanes
+ sub X_3.\lanes, X_3.\lanes, Y_3.\lanes
+
+ // x = rol(x, 8)
+ tbl X_0.16b, {X_0.16b}, ROTATE_TABLE.16b
+ tbl X_1.16b, {X_1.16b}, ROTATE_TABLE.16b
+ tbl X_2.16b, {X_2.16b}, ROTATE_TABLE.16b
+ tbl X_3.16b, {X_3.16b}, ROTATE_TABLE.16b
+.endm
+
+.macro _next_xts_tweak next, cur, tmp, n
+.if \n == 64
+ /*
+ * Calculate the next tweak by multiplying the current one by x,
+ * modulo p(x) = x^128 + x^7 + x^2 + x + 1.
+ */
+ sshr \tmp\().2d, \cur\().2d, #63
+ and \tmp\().16b, \tmp\().16b, GFMUL_TABLE.16b
+ shl \next\().2d, \cur\().2d, #1
+ ext \tmp\().16b, \tmp\().16b, \tmp\().16b, #8
+ eor \next\().16b, \next\().16b, \tmp\().16b
+.else
+ /*
+ * Calculate the next two tweaks by multiplying the current ones by x^2,
+ * modulo p(x) = x^64 + x^4 + x^3 + x + 1.
+ */
+ ushr \tmp\().2d, \cur\().2d, #62
+ shl \next\().2d, \cur\().2d, #2
+ tbl \tmp\().16b, {GFMUL_TABLE.16b}, \tmp\().16b
+ eor \next\().16b, \next\().16b, \tmp\().16b
+.endif
+.endm
+
+/*
+ * _speck_xts_crypt() - Speck-XTS encryption/decryption
+ *
+ * Encrypt or decrypt NBYTES bytes of data from the SRC buffer to the DST buffer
+ * using Speck-XTS, specifically the variant with a block size of '2n' and round
+ * count given by NROUNDS. The expanded round keys are given in ROUND_KEYS, and
+ * the current XTS tweak value is given in TWEAK. It's assumed that NBYTES is a
+ * nonzero multiple of 128.
+ */
+.macro _speck_xts_crypt n, lanes, decrypting
+
+ /*
+ * If decrypting, modify the ROUND_KEYS parameter to point to the last
+ * round key rather than the first, since for decryption the round keys
+ * are used in reverse order.
+ */
+.if \decrypting
+ mov NROUNDS, NROUNDS /* zero the high 32 bits */
+.if \n == 64
+ add ROUND_KEYS, ROUND_KEYS, NROUNDS_X, lsl #3
+ sub ROUND_KEYS, ROUND_KEYS, #8
+.else
+ add ROUND_KEYS, ROUND_KEYS, NROUNDS_X, lsl #2
+ sub ROUND_KEYS, ROUND_KEYS, #4
+.endif
+.endif
+
+ // Load the index vector for tbl-based 8-bit rotates
+.if \decrypting
+ ldr ROTATE_TABLE_Q, .Lrol\n\()_8_table
+.else
+ ldr ROTATE_TABLE_Q, .Lror\n\()_8_table
+.endif
+
+ // One-time XTS preparation
+.if \n == 64
+ // Load first tweak
+ ld1 {TWEAKV0.16b}, [TWEAK]
+
+ // Load GF(2^128) multiplication table
+ ldr GFMUL_TABLE_Q, .Lgf128mul_table
+.else
+ // Load first tweak
+ ld1 {TWEAKV0.8b}, [TWEAK]
+
+ // Load GF(2^64) multiplication table
+ ldr GFMUL_TABLE_Q, .Lgf64mul_table
+
+ // Calculate second tweak, packing it together with the first
+ ushr TMP0.2d, TWEAKV0.2d, #63
+ shl TMP1.2d, TWEAKV0.2d, #1
+ tbl TMP0.8b, {GFMUL_TABLE.16b}, TMP0.8b
+ eor TMP0.8b, TMP0.8b, TMP1.8b
+ mov TWEAKV0.d[1], TMP0.d[0]
+.endif
+
+.Lnext_128bytes_\@:
+
+ // Calculate XTS tweaks for next 128 bytes
+ _next_xts_tweak TWEAKV1, TWEAKV0, TMP0, \n
+ _next_xts_tweak TWEAKV2, TWEAKV1, TMP0, \n
+ _next_xts_tweak TWEAKV3, TWEAKV2, TMP0, \n
+ _next_xts_tweak TWEAKV4, TWEAKV3, TMP0, \n
+ _next_xts_tweak TWEAKV5, TWEAKV4, TMP0, \n
+ _next_xts_tweak TWEAKV6, TWEAKV5, TMP0, \n
+ _next_xts_tweak TWEAKV7, TWEAKV6, TMP0, \n
+ _next_xts_tweak TWEAKV_NEXT, TWEAKV7, TMP0, \n
+
+ // Load the next source blocks into {X,Y}[0-3]
+ ld1 {X_0.16b-Y_1.16b}, [SRC], #64
+ ld1 {X_2.16b-Y_3.16b}, [SRC], #64
+
+ // XOR the source blocks with their XTS tweaks
+ eor TMP0.16b, X_0.16b, TWEAKV0.16b
+ eor Y_0.16b, Y_0.16b, TWEAKV1.16b
+ eor TMP1.16b, X_1.16b, TWEAKV2.16b
+ eor Y_1.16b, Y_1.16b, TWEAKV3.16b
+ eor TMP2.16b, X_2.16b, TWEAKV4.16b
+ eor Y_2.16b, Y_2.16b, TWEAKV5.16b
+ eor TMP3.16b, X_3.16b, TWEAKV6.16b
+ eor Y_3.16b, Y_3.16b, TWEAKV7.16b
+
+ /*
+ * De-interleave the 'x' and 'y' elements of each block, i.e. make it so
+ * that the X[0-3] registers contain only the second halves of blocks,
+ * and the Y[0-3] registers contain only the first halves of blocks.
+ * (Speck uses the order (y, x) rather than the more intuitive (x, y).)
+ */
+ uzp2 X_0.\lanes, TMP0.\lanes, Y_0.\lanes
+ uzp1 Y_0.\lanes, TMP0.\lanes, Y_0.\lanes
+ uzp2 X_1.\lanes, TMP1.\lanes, Y_1.\lanes
+ uzp1 Y_1.\lanes, TMP1.\lanes, Y_1.\lanes
+ uzp2 X_2.\lanes, TMP2.\lanes, Y_2.\lanes
+ uzp1 Y_2.\lanes, TMP2.\lanes, Y_2.\lanes
+ uzp2 X_3.\lanes, TMP3.\lanes, Y_3.\lanes
+ uzp1 Y_3.\lanes, TMP3.\lanes, Y_3.\lanes
+
+ // Do the cipher rounds
+ mov x6, ROUND_KEYS
+ mov w7, NROUNDS
+.Lnext_round_\@:
+.if \decrypting
+ ld1r {ROUND_KEY.\lanes}, [x6]
+ sub x6, x6, #( \n / 8 )
+ _speck_unround_128bytes \n, \lanes
+.else
+ ld1r {ROUND_KEY.\lanes}, [x6], #( \n / 8 )
+ _speck_round_128bytes \n, \lanes
+.endif
+ subs w7, w7, #1
+ bne .Lnext_round_\@
+
+ // Re-interleave the 'x' and 'y' elements of each block
+ zip1 TMP0.\lanes, Y_0.\lanes, X_0.\lanes
+ zip2 Y_0.\lanes, Y_0.\lanes, X_0.\lanes
+ zip1 TMP1.\lanes, Y_1.\lanes, X_1.\lanes
+ zip2 Y_1.\lanes, Y_1.\lanes, X_1.\lanes
+ zip1 TMP2.\lanes, Y_2.\lanes, X_2.\lanes
+ zip2 Y_2.\lanes, Y_2.\lanes, X_2.\lanes
+ zip1 TMP3.\lanes, Y_3.\lanes, X_3.\lanes
+ zip2 Y_3.\lanes, Y_3.\lanes, X_3.\lanes
+
+ // XOR the encrypted/decrypted blocks with the tweaks calculated earlier
+ eor X_0.16b, TMP0.16b, TWEAKV0.16b
+ eor Y_0.16b, Y_0.16b, TWEAKV1.16b
+ eor X_1.16b, TMP1.16b, TWEAKV2.16b
+ eor Y_1.16b, Y_1.16b, TWEAKV3.16b
+ eor X_2.16b, TMP2.16b, TWEAKV4.16b
+ eor Y_2.16b, Y_2.16b, TWEAKV5.16b
+ eor X_3.16b, TMP3.16b, TWEAKV6.16b
+ eor Y_3.16b, Y_3.16b, TWEAKV7.16b
+ mov TWEAKV0.16b, TWEAKV_NEXT.16b
+
+ // Store the ciphertext in the destination buffer
+ st1 {X_0.16b-Y_1.16b}, [DST], #64
+ st1 {X_2.16b-Y_3.16b}, [DST], #64
+
+ // Continue if there are more 128-byte chunks remaining
+ subs NBYTES, NBYTES, #128
+ bne .Lnext_128bytes_\@
+
+ // Store the next tweak and return
+.if \n == 64
+ st1 {TWEAKV_NEXT.16b}, [TWEAK]
+.else
+ st1 {TWEAKV_NEXT.8b}, [TWEAK]
+.endif
+ ret
+.endm
+
+ENTRY(speck128_xts_encrypt_neon)
+ _speck_xts_crypt n=64, lanes=2d, decrypting=0
+ENDPROC(speck128_xts_encrypt_neon)
+
+ENTRY(speck128_xts_decrypt_neon)
+ _speck_xts_crypt n=64, lanes=2d, decrypting=1
+ENDPROC(speck128_xts_decrypt_neon)
+
+ENTRY(speck64_xts_encrypt_neon)
+ _speck_xts_crypt n=32, lanes=4s, decrypting=0
+ENDPROC(speck64_xts_encrypt_neon)
+
+ENTRY(speck64_xts_decrypt_neon)
+ _speck_xts_crypt n=32, lanes=4s, decrypting=1
+ENDPROC(speck64_xts_decrypt_neon)
diff --git a/arch/arm64/crypto/speck-neon-glue.c b/arch/arm64/crypto/speck-neon-glue.c
new file mode 100644
index 0000000..c7d1856
--- /dev/null
+++ b/arch/arm64/crypto/speck-neon-glue.c
@@ -0,0 +1,308 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
+ * (64-bit version; based on the 32-bit version)
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/algapi.h>
+#include <crypto/gf128mul.h>
+#include <crypto/speck.h>
+#include <crypto/xts.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+/* The assembly functions only handle multiples of 128 bytes */
+#define SPECK_NEON_CHUNK_SIZE 128
+
+/* Speck128 */
+
+struct speck128_xts_tfm_ctx {
+ struct speck128_tfm_ctx main_key;
+ struct speck128_tfm_ctx tweak_key;
+};
+
+asmlinkage void speck128_xts_encrypt_neon(const u64 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+asmlinkage void speck128_xts_decrypt_neon(const u64 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+typedef void (*speck128_crypt_one_t)(const struct speck128_tfm_ctx *,
+ u8 *, const u8 *);
+typedef void (*speck128_xts_crypt_many_t)(const u64 *, int, void *,
+ const void *, unsigned int, void *);
+
+static __always_inline int
+__speck128_xts_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
+ struct scatterlist *src, unsigned int nbytes,
+ speck128_crypt_one_t crypt_one,
+ speck128_xts_crypt_many_t crypt_many)
+{
+ struct crypto_blkcipher *tfm = desc->tfm;
+ const struct speck128_xts_tfm_ctx *ctx = crypto_blkcipher_ctx(tfm);
+ struct blkcipher_walk walk;
+ le128 tweak;
+ int err;
+
+ blkcipher_walk_init(&walk, dst, src, nbytes);
+ err = blkcipher_walk_virt_block(desc, &walk, SPECK_NEON_CHUNK_SIZE);
+
+ crypto_speck128_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);
+
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+ u8 *dst = walk.dst.virt.addr;
+ const u8 *src = walk.src.virt.addr;
+
+ if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
+ unsigned int count;
+
+ count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
+ kernel_neon_begin();
+ (*crypt_many)(ctx->main_key.round_keys,
+ ctx->main_key.nrounds,
+ dst, src, count, &tweak);
+ kernel_neon_end();
+ dst += count;
+ src += count;
+ nbytes -= count;
+ }
+
+ /* Handle any remainder with generic code */
+ while (nbytes >= sizeof(tweak)) {
+ le128_xor((le128 *)dst, (const le128 *)src, &tweak);
+ (*crypt_one)(&ctx->main_key, dst, dst);
+ le128_xor((le128 *)dst, (const le128 *)dst, &tweak);
+ gf128mul_x_ble((be128 *)&tweak, (const be128 *)&tweak);
+
+ dst += sizeof(tweak);
+ src += sizeof(tweak);
+ nbytes -= sizeof(tweak);
+ }
+ err = blkcipher_walk_done(desc, &walk, nbytes);
+ }
+
+ return err;
+}
+
+static int speck128_xts_encrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst,
+ struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck128_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck128_encrypt,
+ speck128_xts_encrypt_neon);
+}
+
+static int speck128_xts_decrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst,
+ struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck128_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck128_decrypt,
+ speck128_xts_decrypt_neon);
+}
+
+static int speck128_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
+ unsigned int keylen)
+{
+ struct speck128_xts_tfm_ctx *ctx = crypto_tfm_ctx(tfm);
+ int err;
+
+ if (keylen % 2)
+ return -EINVAL;
+
+ keylen /= 2;
+
+ err = crypto_speck128_setkey(&ctx->main_key, key, keylen);
+ if (err)
+ return err;
+
+ return crypto_speck128_setkey(&ctx->tweak_key, key + keylen, keylen);
+}
+
+/* Speck64 */
+
+struct speck64_xts_tfm_ctx {
+ struct speck64_tfm_ctx main_key;
+ struct speck64_tfm_ctx tweak_key;
+};
+
+asmlinkage void speck64_xts_encrypt_neon(const u32 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+asmlinkage void speck64_xts_decrypt_neon(const u32 *round_keys, int nrounds,
+ void *dst, const void *src,
+ unsigned int nbytes, void *tweak);
+
+typedef void (*speck64_crypt_one_t)(const struct speck64_tfm_ctx *,
+ u8 *, const u8 *);
+typedef void (*speck64_xts_crypt_many_t)(const u32 *, int, void *,
+ const void *, unsigned int, void *);
+
+static __always_inline int
+__speck64_xts_crypt(struct blkcipher_desc *desc, struct scatterlist *dst,
+ struct scatterlist *src, unsigned int nbytes,
+ speck64_crypt_one_t crypt_one,
+ speck64_xts_crypt_many_t crypt_many)
+{
+ struct crypto_blkcipher *tfm = desc->tfm;
+ const struct speck64_xts_tfm_ctx *ctx = crypto_blkcipher_ctx(tfm);
+ struct blkcipher_walk walk;
+ __le64 tweak;
+ int err;
+
+ blkcipher_walk_init(&walk, dst, src, nbytes);
+ err = blkcipher_walk_virt_block(desc, &walk, SPECK_NEON_CHUNK_SIZE);
+
+ crypto_speck64_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);
+
+ while (walk.nbytes > 0) {
+ unsigned int nbytes = walk.nbytes;
+ u8 *dst = walk.dst.virt.addr;
+ const u8 *src = walk.src.virt.addr;
+
+ if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
+ unsigned int count;
+
+ count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
+ kernel_neon_begin();
+ (*crypt_many)(ctx->main_key.round_keys,
+ ctx->main_key.nrounds,
+ dst, src, count, &tweak);
+ kernel_neon_end();
+ dst += count;
+ src += count;
+ nbytes -= count;
+ }
+
+ /* Handle any remainder with generic code */
+ while (nbytes >= sizeof(tweak)) {
+ *(__le64 *)dst = *(__le64 *)src ^ tweak;
+ (*crypt_one)(&ctx->main_key, dst, dst);
+ *(__le64 *)dst ^= tweak;
+ tweak = cpu_to_le64((le64_to_cpu(tweak) << 1) ^
+ ((tweak & cpu_to_le64(1ULL << 63)) ?
+ 0x1B : 0));
+ dst += sizeof(tweak);
+ src += sizeof(tweak);
+ nbytes -= sizeof(tweak);
+ }
+ err = blkcipher_walk_done(desc, &walk, nbytes);
+ }
+
+ return err;
+}
+
+static int speck64_xts_encrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst, struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck64_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck64_encrypt,
+ speck64_xts_encrypt_neon);
+}
+
+static int speck64_xts_decrypt(struct blkcipher_desc *desc,
+ struct scatterlist *dst, struct scatterlist *src,
+ unsigned int nbytes)
+{
+ return __speck64_xts_crypt(desc, dst, src, nbytes,
+ crypto_speck64_decrypt,
+ speck64_xts_decrypt_neon);
+}
+
+static int speck64_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
+ unsigned int keylen)
+{
+ struct speck64_xts_tfm_ctx *ctx = crypto_tfm_ctx(tfm);
+ int err;
+
+ if (keylen % 2)
+ return -EINVAL;
+
+ keylen /= 2;
+
+ err = crypto_speck64_setkey(&ctx->main_key, key, keylen);
+ if (err)
+ return err;
+
+ return crypto_speck64_setkey(&ctx->tweak_key, key + keylen, keylen);
+}
+
+static struct crypto_alg speck_algs[] = {
+ {
+ .cra_name = "xts(speck128)",
+ .cra_driver_name = "xts-speck128-neon",
+ .cra_priority = 300,
+ .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+ .cra_blocksize = SPECK128_BLOCK_SIZE,
+ .cra_type = &crypto_blkcipher_type,
+ .cra_ctxsize = sizeof(struct speck128_xts_tfm_ctx),
+ .cra_alignmask = 7,
+ .cra_module = THIS_MODULE,
+ .cra_u = {
+ .blkcipher = {
+ .min_keysize = 2 * SPECK128_128_KEY_SIZE,
+ .max_keysize = 2 * SPECK128_256_KEY_SIZE,
+ .ivsize = SPECK128_BLOCK_SIZE,
+ .setkey = speck128_xts_setkey,
+ .encrypt = speck128_xts_encrypt,
+ .decrypt = speck128_xts_decrypt,
+ }
+ }
+ }, {
+ .cra_name = "xts(speck64)",
+ .cra_driver_name = "xts-speck64-neon",
+ .cra_priority = 300,
+ .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+ .cra_blocksize = SPECK64_BLOCK_SIZE,
+ .cra_type = &crypto_blkcipher_type,
+ .cra_ctxsize = sizeof(struct speck64_xts_tfm_ctx),
+ .cra_alignmask = 7,
+ .cra_module = THIS_MODULE,
+ .cra_u = {
+ .blkcipher = {
+ .min_keysize = 2 * SPECK64_96_KEY_SIZE,
+ .max_keysize = 2 * SPECK64_128_KEY_SIZE,
+ .ivsize = SPECK64_BLOCK_SIZE,
+ .setkey = speck64_xts_setkey,
+ .encrypt = speck64_xts_encrypt,
+ .decrypt = speck64_xts_decrypt,
+ }
+ }
+ }
+};
+
+static int __init speck_neon_module_init(void)
+{
+ if (!(elf_hwcap & HWCAP_ASIMD))
+ return -ENODEV;
+ return crypto_register_algs(speck_algs, ARRAY_SIZE(speck_algs));
+}
+
+static void __exit speck_neon_module_exit(void)
+{
+ crypto_unregister_algs(speck_algs, ARRAY_SIZE(speck_algs));
+}
+
+module_init(speck_neon_module_init);
+module_exit(speck_neon_module_exit);
+
+MODULE_DESCRIPTION("Speck block cipher (NEON-accelerated)");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+MODULE_ALIAS_CRYPTO("xts(speck128)");
+MODULE_ALIAS_CRYPTO("xts-speck128-neon");
+MODULE_ALIAS_CRYPTO("xts(speck64)");
+MODULE_ALIAS_CRYPTO("xts-speck64-neon");
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 40d1351e..0a11cd5 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -87,9 +87,26 @@
static inline void acpi_init_cpus(void) { }
#endif /* CONFIG_ACPI */
+#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
+bool acpi_parking_protocol_valid(int cpu);
+void __init
+acpi_set_mailbox_entry(int cpu, struct acpi_madt_generic_interrupt *processor);
+#else
+static inline bool acpi_parking_protocol_valid(int cpu) { return false; }
+static inline void
+acpi_set_mailbox_entry(int cpu, struct acpi_madt_generic_interrupt *processor)
+{}
+#endif
+
static inline const char *acpi_get_enable_method(int cpu)
{
- return acpi_psci_present() ? "psci" : NULL;
+ if (acpi_psci_present())
+ return "psci";
+
+ if (acpi_parking_protocol_valid(cpu))
+ return "parking-protocol";
+
+ return NULL;
}
#ifdef CONFIG_ACPI_APEI
diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index d56ec07..55101bd 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -1,6 +1,9 @@
#ifndef __ASM_ALTERNATIVE_H
#define __ASM_ALTERNATIVE_H
+#include <asm/cpufeature.h>
+#include <asm/insn.h>
+
#ifndef __ASSEMBLY__
#include <linux/init.h>
@@ -19,7 +22,6 @@
void __init apply_alternatives_all(void);
void apply_alternatives(void *start, size_t length);
-void free_alternatives_memory(void);
#define ALTINSTR_ENTRY(feature) \
" .word 661b - .\n" /* label */ \
@@ -64,6 +66,8 @@
#else
+#include <asm/assembler.h>
+
.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
.word \orig_offset - .
.word \alt_offset - .
@@ -87,26 +91,15 @@
.endm
/*
- * Begin an alternative code sequence.
+ * Alternative sequences
*
- * The code that follows this macro will be assembled and linked as
- * normal. There are no restrictions on this code.
- */
-.macro alternative_if_not cap, enable = 1
- .if \enable
- .pushsection .altinstructions, "a"
- altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
- .popsection
-661:
- .endif
-.endm
-
-/*
- * Provide the alternative code sequence.
+ * The code for the case where the capability is not present will be
+ * assembled and linked as normal. There are no restrictions on this
+ * code.
*
- * The code that follows this macro is assembled into a special
- * section to be used for dynamic patching. Code that follows this
- * macro must:
+ * The code for the case where the capability is present will be
+ * assembled into a special section to be used for dynamic patching.
+ * Code for that case must:
*
* 1. Be exactly the same length (in bytes) as the default code
* sequence.
@@ -115,27 +108,130 @@
* alternative sequence it is defined in (branches into an
* alternative sequence are not fixed up).
*/
-.macro alternative_else, enable = 1
- .if \enable
-662: .pushsection .altinstr_replacement, "ax"
-663:
+
+/*
+ * Begin an alternative code sequence.
+ */
+.macro alternative_if_not cap
+ .set .Lasm_alt_mode, 0
+ .pushsection .altinstructions, "a"
+ altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
+ .popsection
+661:
+.endm
+
+.macro alternative_if cap
+ .set .Lasm_alt_mode, 1
+ .pushsection .altinstructions, "a"
+ altinstruction_entry 663f, 661f, \cap, 664f-663f, 662f-661f
+ .popsection
+ .pushsection .altinstr_replacement, "ax"
+ .align 2 /* So GAS knows label 661 is suitably aligned */
+661:
+.endm
+
+/*
+ * Provide the other half of the alternative code sequence.
+ */
+.macro alternative_else
+662:
+ .if .Lasm_alt_mode==0
+ .pushsection .altinstr_replacement, "ax"
+ .else
+ .popsection
.endif
+663:
.endm
/*
* Complete an alternative code sequence.
*/
-.macro alternative_endif, enable = 1
- .if \enable
-664: .popsection
+.macro alternative_endif
+664:
+ .if .Lasm_alt_mode==0
+ .popsection
+ .endif
.org . - (664b-663b) + (662b-661b)
.org . - (662b-661b) + (664b-663b)
- .endif
+.endm
+
+/*
+ * Provides a trivial alternative or default sequence consisting solely
+ * of NOPs. The number of NOPs is chosen automatically to match the
+ * previous case.
+ */
+.macro alternative_else_nop_endif
+alternative_else
+ nops (662b-661b) / AARCH64_INSN_SIZE
+alternative_endif
.endm
#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...) \
alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
+.macro user_alt, label, oldinstr, newinstr, cond
+9999: alternative_insn "\oldinstr", "\newinstr", \cond
+ _ASM_EXTABLE 9999b, \label
+.endm
+
+/*
+ * Generate the assembly for UAO alternatives with exception table entries.
+ * This is complicated as there is no post-increment or pair versions of the
+ * unprivileged instructions, and USER() only works for single instructions.
+ */
+#ifdef CONFIG_ARM64_UAO
+ .macro uao_ldp l, reg1, reg2, addr, post_inc
+ alternative_if_not ARM64_HAS_UAO
+8888: ldp \reg1, \reg2, [\addr], \post_inc;
+8889: nop;
+ nop;
+ alternative_else
+ ldtr \reg1, [\addr];
+ ldtr \reg2, [\addr, #8];
+ add \addr, \addr, \post_inc;
+ alternative_endif
+
+ _asm_extable 8888b,\l;
+ _asm_extable 8889b,\l;
+ .endm
+
+ .macro uao_stp l, reg1, reg2, addr, post_inc
+ alternative_if_not ARM64_HAS_UAO
+8888: stp \reg1, \reg2, [\addr], \post_inc;
+8889: nop;
+ nop;
+ alternative_else
+ sttr \reg1, [\addr];
+ sttr \reg2, [\addr, #8];
+ add \addr, \addr, \post_inc;
+ alternative_endif
+
+ _asm_extable 8888b,\l;
+ _asm_extable 8889b,\l;
+ .endm
+
+ .macro uao_user_alternative l, inst, alt_inst, reg, addr, post_inc
+ alternative_if_not ARM64_HAS_UAO
+8888: \inst \reg, [\addr], \post_inc;
+ nop;
+ alternative_else
+ \alt_inst \reg, [\addr];
+ add \addr, \addr, \post_inc;
+ alternative_endif
+
+ _asm_extable 8888b,\l;
+ .endm
+#else
+ .macro uao_ldp l, reg1, reg2, addr, post_inc
+ USER(\l, ldp \reg1, \reg2, [\addr], \post_inc)
+ .endm
+ .macro uao_stp l, reg1, reg2, addr, post_inc
+ USER(\l, stp \reg1, \reg2, [\addr], \post_inc)
+ .endm
+ .macro uao_user_alternative l, inst, alt_inst, reg, addr, post_inc
+ USER(\l, \inst \reg, [\addr], \post_inc)
+ .endm
+#endif
#endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index f68abb1..8dade93 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
/*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
*
* Copyright (C) 1996-2000 Russell King
* Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,10 @@
#ifndef __ASM_ASSEMBLER_H
#define __ASM_ASSEMBLER_H
+#include <asm/asm-offsets.h>
+#include <asm/cpufeature.h>
+#include <asm/page.h>
+#include <asm/pgtable-hwdef.h>
#include <asm/cputype.h>
#include <asm/ptrace.h>
#include <asm/thread_info.h>
@@ -50,6 +54,15 @@
msr daifclr, #2
.endm
+ .macro save_and_disable_irq, flags
+ mrs \flags, daif
+ msr daifset, #2
+ .endm
+
+ .macro restore_irq, flags
+ msr daif, \flags
+ .endm
+
/*
* Enable and disable debug exceptions.
*/
@@ -95,12 +108,28 @@
dmb \opt
.endm
+/*
+ * NOP sequence
+ */
+ .macro nops, num
+ .rept \num
+ nop
+ .endr
+ .endm
+
+/*
+ * Emit an entry into the exception table
+ */
+ .macro _asm_extable, from, to
+ .pushsection __ex_table, "a"
+ .align 3
+ .long (\from - .), (\to - .)
+ .popsection
+ .endm
+
#define USER(l, x...) \
9999: x; \
- .section __ex_table,"a"; \
- .align 3; \
- .quad 9999b,l; \
- .previous
+ _asm_extable 9999b, l
/*
* Register aliases.
@@ -194,6 +223,133 @@
str \src, [\tmp, :lo12:\sym]
.endm
+ /*
+ * @dst: Result of per_cpu(sym, smp_processor_id())
+ * @sym: The name of the per-cpu variable
+ * @tmp: scratch register
+ */
+ .macro adr_this_cpu, dst, sym, tmp
+ adr_l \dst, \sym
+ mrs \tmp, tpidr_el1
+ add \dst, \dst, \tmp
+ .endm
+
+ /*
+ * @dst: Result of READ_ONCE(per_cpu(sym, smp_processor_id()))
+ * @sym: The name of the per-cpu variable
+ * @tmp: scratch register
+ */
+ .macro ldr_this_cpu dst, sym, tmp
+ adr_l \dst, \sym
+ mrs \tmp, tpidr_el1
+ ldr \dst, [\dst, \tmp]
+ .endm
+
+/*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+ .macro vma_vm_mm, rd, rn
+ ldr \rd, [\rn, #VMA_VM_MM]
+ .endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+ .macro mmid, rd, rn
+ ldr \rd, [\rn, #MM_CONTEXT_ID]
+ .endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+ .macro dcache_line_size, reg, tmp
+ mrs \tmp, ctr_el0 // read CTR
+ ubfm \tmp, \tmp, #16, #19 // cache line size encoding
+ mov \reg, #4 // bytes per word
+ lsl \reg, \reg, \tmp // actual cache line size
+ .endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+ .macro icache_line_size, reg, tmp
+ mrs \tmp, ctr_el0 // read CTR
+ and \tmp, \tmp, #0xf // cache line size encoding
+ mov \reg, #4 // bytes per word
+ lsl \reg, \reg, \tmp // actual cache line size
+ .endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+ .macro tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+ ldr_l \tmpreg, idmap_t0sz
+ bfi \valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+ .endm
+
+/*
+ * Macro to perform a data cache maintenance for the interval
+ * [kaddr, kaddr + size)
+ *
+ * op: operation passed to dc instruction
+ * domain: domain used in dsb instruciton
+ * kaddr: starting virtual address of the region
+ * size: size of the region
+ * Corrupts: kaddr, size, tmp1, tmp2
+ */
+ .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
+ dcache_line_size \tmp1, \tmp2
+ add \size, \kaddr, \size
+ sub \tmp2, \tmp1, #1
+ bic \kaddr, \kaddr, \tmp2
+9998:
+ .if (\op == cvau || \op == cvac)
+alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE
+ dc \op, \kaddr
+alternative_else
+ dc civac, \kaddr
+alternative_endif
+ .else
+ dc \op, \kaddr
+ .endif
+ add \kaddr, \kaddr, \tmp1
+ cmp \kaddr, \size
+ b.lo 9998b
+ dsb \domain
+ .endm
+
+/*
+ * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
+ */
+ .macro reset_pmuserenr_el0, tmpreg
+ mrs \tmpreg, id_aa64dfr0_el1 // Check ID_AA64DFR0_EL1 PMUVer
+ sbfx \tmpreg, \tmpreg, #8, #4
+ cmp \tmpreg, #1 // Skip if no PMU present
+ b.lt 9000f
+ msr pmuserenr_el0, xzr // Disable PMU access from EL0
+9000:
+ .endm
+
+/*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+ .macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+9998: ldp \t1, \t2, [\src]
+ ldp \t3, \t4, [\src, #16]
+ ldp \t5, \t6, [\src, #32]
+ ldp \t7, \t8, [\src, #48]
+ add \src, \src, #64
+ stnp \t1, \t2, [\dest]
+ stnp \t3, \t4, [\dest, #16]
+ stnp \t5, \t6, [\dest, #32]
+ stnp \t7, \t8, [\dest, #48]
+ add \dest, \dest, #64
+ tst \src, #(PAGE_SIZE - 1)
+ b.ne 9998b
+ .endm
+
/*
* Annotate a function as position independent, i.e., safe to be called before
* the kernel virtual mapping is activated.
@@ -206,6 +362,17 @@
ENDPROC(x)
/*
+ * Emit a 64-bit absolute little endian symbol reference in a way that
+ * ensures that it will be resolved at build time, even when building a
+ * PIE binary. This requires cooperation from the linker script, which
+ * must emit the lo32/hi32 halves individually.
+ */
+ .macro le64sym, sym
+ .long \sym\()_lo32
+ .long \sym\()_hi32
+ .endm
+
+ /*
* mov_q - move an immediate constant into a 64-bit register using
* between 2 and 4 movz/movk instructions (depending on the
* magnitude and sign of the operand)
@@ -226,6 +393,13 @@
.endm
/*
+ * Return the current thread_info.
+ */
+ .macro get_thread_info, rd
+ mrs \rd, sp_el0
+ .endm
+
+/*
* Check the MIDR_EL1 of the current CPU for a given model and a range of
* variant/revision. See asm/cputype.h for the macros used below.
*
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index e3438c6..a000e47 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -36,7 +36,7 @@
" stclr %w[i], %[v]\n")
: [i] "+r" (w0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic_or(int i, atomic_t *v)
@@ -48,7 +48,7 @@
" stset %w[i], %[v]\n")
: [i] "+r" (w0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic_xor(int i, atomic_t *v)
@@ -60,7 +60,7 @@
" steor %w[i], %[v]\n")
: [i] "+r" (w0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic_add(int i, atomic_t *v)
@@ -72,7 +72,7 @@
" stadd %w[i], %[v]\n")
: [i] "+r" (w0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
#define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \
@@ -90,7 +90,7 @@
" add %w[i], %w[i], w30") \
: [i] "+r" (w0), [v] "+Q" (v->counter) \
: "r" (x1) \
- : "x30" , ##cl); \
+ : __LL_SC_CLOBBERS, ##cl); \
\
return w0; \
}
@@ -116,7 +116,7 @@
" stclr %w[i], %[v]")
: [i] "+&r" (w0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic_sub(int i, atomic_t *v)
@@ -133,7 +133,7 @@
" stadd %w[i], %[v]")
: [i] "+&r" (w0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
#define ATOMIC_OP_SUB_RETURN(name, mb, cl...) \
@@ -153,7 +153,7 @@
" add %w[i], %w[i], w30") \
: [i] "+&r" (w0), [v] "+Q" (v->counter) \
: "r" (x1) \
- : "x30" , ##cl); \
+ : __LL_SC_CLOBBERS , ##cl); \
\
return w0; \
}
@@ -177,7 +177,7 @@
" stclr %[i], %[v]\n")
: [i] "+r" (x0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic64_or(long i, atomic64_t *v)
@@ -189,7 +189,7 @@
" stset %[i], %[v]\n")
: [i] "+r" (x0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic64_xor(long i, atomic64_t *v)
@@ -201,7 +201,7 @@
" steor %[i], %[v]\n")
: [i] "+r" (x0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic64_add(long i, atomic64_t *v)
@@ -213,7 +213,7 @@
" stadd %[i], %[v]\n")
: [i] "+r" (x0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
#define ATOMIC64_OP_ADD_RETURN(name, mb, cl...) \
@@ -231,7 +231,7 @@
" add %[i], %[i], x30") \
: [i] "+r" (x0), [v] "+Q" (v->counter) \
: "r" (x1) \
- : "x30" , ##cl); \
+ : __LL_SC_CLOBBERS, ##cl); \
\
return x0; \
}
@@ -257,7 +257,7 @@
" stclr %[i], %[v]")
: [i] "+&r" (x0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
static inline void atomic64_sub(long i, atomic64_t *v)
@@ -274,7 +274,7 @@
" stadd %[i], %[v]")
: [i] "+&r" (x0), [v] "+Q" (v->counter)
: "r" (x1)
- : "x30");
+ : __LL_SC_CLOBBERS);
}
#define ATOMIC64_OP_SUB_RETURN(name, mb, cl...) \
@@ -294,7 +294,7 @@
" add %[i], %[i], x30") \
: [i] "+&r" (x0), [v] "+Q" (v->counter) \
: "r" (x1) \
- : "x30" , ##cl); \
+ : __LL_SC_CLOBBERS, ##cl); \
\
return x0; \
}
@@ -330,7 +330,7 @@
"2:")
: [ret] "+&r" (x0), [v] "+Q" (v->counter)
:
- : "x30", "cc", "memory");
+ : __LL_SC_CLOBBERS, "cc", "memory");
return x0;
}
@@ -359,7 +359,7 @@
" mov %" #w "[ret], " #w "30") \
: [ret] "+r" (x0), [v] "+Q" (*(unsigned long *)ptr) \
: [old] "r" (x1), [new] "r" (x2) \
- : "x30" , ##cl); \
+ : __LL_SC_CLOBBERS, ##cl); \
\
return x0; \
}
@@ -416,7 +416,7 @@
[v] "+Q" (*(unsigned long *)ptr) \
: [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \
[oldval1] "r" (oldval1), [oldval2] "r" (oldval2) \
- : "x30" , ##cl); \
+ : __LL_SC_CLOBBERS, ##cl); \
\
return x0; \
}
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index f2d2c0b..0671711 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -20,6 +20,9 @@
#ifndef __ASSEMBLY__
+#define __nops(n) ".rept " #n "\nnop\n.endr\n"
+#define nops(n) asm volatile(__nops(n))
+
#define sev() asm volatile("sev" : : : "memory")
#define wfe() asm volatile("wfe" : : : "memory")
#define wfi() asm volatile("wfi" : : : "memory")
diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 81151b6..ebf2481 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -11,4 +11,10 @@
#define MIN_FDT_ALIGN 8
#define MAX_FDT_SIZE SZ_2M
+/*
+ * arm64 requires the kernel image to placed
+ * TEXT_OFFSET bytes beyond a 2 MB aligned base
+ */
+#define MIN_KIMG_ALIGN SZ_2M
+
#endif
diff --git a/arch/arm64/include/asm/brk-imm.h b/arch/arm64/include/asm/brk-imm.h
new file mode 100644
index 0000000..ed693c5
--- /dev/null
+++ b/arch/arm64/include/asm/brk-imm.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __ASM_BRK_IMM_H
+#define __ASM_BRK_IMM_H
+
+/*
+ * #imm16 values used for BRK instruction generation
+ * Allowed values for kgdb are 0x400 - 0x7ff
+ * 0x100: for triggering a fault on purpose (reserved)
+ * 0x400: for dynamic BRK instruction
+ * 0x401: for compile time BRK instruction
+ * 0x800: kernel-mode BUG() and WARN() traps
+ */
+#define FAULT_BRK_IMM 0x100
+#define KGDB_DYN_DBG_BRK_IMM 0x400
+#define KGDB_COMPILED_DBG_BRK_IMM 0x401
+#define BUG_BRK_IMM 0x800
+
+#endif
diff --git a/arch/arm64/include/asm/bug.h b/arch/arm64/include/asm/bug.h
index ac6382b..0bfe1df1 100644
--- a/arch/arm64/include/asm/bug.h
+++ b/arch/arm64/include/asm/bug.h
@@ -18,7 +18,7 @@
#ifndef _ARCH_ARM64_ASM_BUG_H
#define _ARCH_ARM64_ASM_BUG_H
-#include <asm/debug-monitors.h>
+#include <asm/brk-imm.h>
#ifdef CONFIG_DEBUG_BUGVERBOSE
#define _BUGVERBOSE_LOCATION(file, line) __BUGVERBOSE_LOCATION(file, line)
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 54efeda..22dda61 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -68,6 +68,7 @@
extern void flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
extern void flush_icache_range(unsigned long start, unsigned long end);
extern void __flush_dcache_area(void *addr, size_t len);
+extern void __clean_dcache_area_pou(void *addr, size_t len);
extern long __flush_cache_user_range(unsigned long start, unsigned long end);
static inline void flush_cache_mm(struct mm_struct *mm)
@@ -155,8 +156,4 @@
int set_memory_x(unsigned long addr, int numpages);
int set_memory_nx(unsigned long addr, int numpages);
-#ifdef CONFIG_DEBUG_RODATA
-void mark_rodata_ro(void);
-#endif
-
#endif
diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index 91ceeb7..270c6b7 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -19,7 +19,6 @@
#define __ASM_CMPXCHG_H
#include <linux/bug.h>
-#include <linux/mmdebug.h>
#include <asm/atomic.h>
#include <asm/barrier.h>
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index b5e9cee..13a6103 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -36,6 +36,7 @@
u64 reg_id_aa64isar1;
u64 reg_id_aa64mmfr0;
u64 reg_id_aa64mmfr1;
+ u64 reg_id_aa64mmfr2;
u64 reg_id_aa64pfr0;
u64 reg_id_aa64pfr1;
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 8884b5d..2c6497a 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -30,9 +30,14 @@
#define ARM64_HAS_LSE_ATOMICS 5
#define ARM64_WORKAROUND_CAVIUM_23154 6
#define ARM64_WORKAROUND_834220 7
-#define ARM64_WORKAROUND_CAVIUM_27456 8
+#define ARM64_HAS_NO_HW_PREFETCH 8
+#define ARM64_HAS_UAO 9
+#define ARM64_ALT_PAN_NOT_UAO 10
+#define ARM64_HAS_VIRT_HOST_EXTN 11
+#define ARM64_WORKAROUND_CAVIUM_27456 12
+#define ARM64_UNMAP_KERNEL_AT_EL0 23
-#define ARM64_NCAPS 9
+#define ARM64_NCAPS 24
#ifndef __ASSEMBLY__
@@ -177,7 +182,7 @@
static inline bool cpu_supports_mixed_endian_el0(void)
{
- return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1));
+ return id_aa64mmfr0_mixed_endian_el0(read_cpuid(SYS_ID_AA64MMFR0_EL1));
}
static inline bool system_supports_mixed_endian_el0(void)
@@ -185,6 +190,12 @@
return id_aa64mmfr0_mixed_endian_el0(read_system_reg(SYS_ID_AA64MMFR0_EL1));
}
+static inline bool system_uses_ttbr0_pan(void)
+{
+ return IS_ENABLED(CONFIG_ARM64_SW_TTBR0_PAN) &&
+ !cpus_have_cap(ARM64_HAS_PAN);
+}
+
#endif /* __ASSEMBLY__ */
#endif
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index f43e10c..0a23fb0 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -32,12 +32,6 @@
#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
((mpidr >> MPIDR_LEVEL_SHIFT(level)) & MPIDR_LEVEL_MASK)
-#define read_cpuid(reg) ({ \
- u64 __val; \
- asm("mrs %0, " #reg : "=r" (__val)); \
- __val; \
-})
-
#define MIDR_REVISION_MASK 0xf
#define MIDR_REVISION(midr) ((midr) & MIDR_REVISION_MASK)
#define MIDR_PARTNUM_SHIFT 4
@@ -65,11 +59,22 @@
MIDR_ARCHITECTURE_MASK | \
MIDR_PARTNUM_MASK)
-#define MIDR_CPU_PART(imp, partnum) \
+#define MIDR_CPU_MODEL(imp, partnum) \
(((imp) << MIDR_IMPLEMENTOR_SHIFT) | \
(0xf << MIDR_ARCHITECTURE_SHIFT) | \
((partnum) << MIDR_PARTNUM_SHIFT))
+#define MIDR_CPU_MODEL_MASK (MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | \
+ MIDR_ARCHITECTURE_MASK)
+
+#define MIDR_IS_CPU_MODEL_RANGE(midr, model, rv_min, rv_max) \
+({ \
+ u32 _model = (midr) & MIDR_CPU_MODEL_MASK; \
+ u32 rv = (midr) & (MIDR_REVISION_MASK | MIDR_VARIANT_MASK); \
+ \
+ _model == (model) && rv >= (rv_min) && rv <= (rv_max); \
+ })
+
#define ARM_CPU_IMP_ARM 0x41
#define ARM_CPU_IMP_APM 0x50
#define ARM_CPU_IMP_CAVIUM 0x43
@@ -84,10 +89,21 @@
#define CAVIUM_CPU_PART_THUNDERX 0x0A1
-#define MIDR_CORTEX_A55 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)
+#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
+#define MIDR_CORTEX_A55 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)
+#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
+#define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
#ifndef __ASSEMBLY__
+#include <asm/sysreg.h>
+
+#define read_cpuid(reg) ({ \
+ u64 __val; \
+ asm("mrs_s %0, " __stringify(reg) : "=r" (__val)); \
+ __val; \
+})
+
/*
* The CPU ID never changes at run time, so we might as well tell the
* compiler that it's constant. Use this function to read the CPU ID
@@ -95,12 +111,12 @@
*/
static inline u32 __attribute_const__ read_cpuid_id(void)
{
- return read_cpuid(MIDR_EL1);
+ return read_cpuid(SYS_MIDR_EL1);
}
static inline u64 __attribute_const__ read_cpuid_mpidr(void)
{
- return read_cpuid(MPIDR_EL1);
+ return read_cpuid(SYS_MPIDR_EL1);
}
static inline unsigned int __attribute_const__ read_cpuid_implementor(void)
@@ -115,7 +131,7 @@
static inline u32 __attribute_const__ read_cpuid_cachetype(void)
{
- return read_cpuid(CTR_EL0);
+ return read_cpuid(SYS_CTR_EL0);
}
#endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/include/asm/current.h b/arch/arm64/include/asm/current.h
new file mode 100644
index 0000000..483a6c9d3
--- /dev/null
+++ b/arch/arm64/include/asm/current.h
@@ -0,0 +1,35 @@
+#ifndef __ASM_CURRENT_H
+#define __ASM_CURRENT_H
+
+#include <linux/compiler.h>
+
+#include <asm/sysreg.h>
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+struct task_struct;
+
+/*
+ * We don't use read_sysreg() as we want the compiler to cache the value where
+ * possible.
+ */
+static __always_inline struct task_struct *get_current(void)
+{
+ unsigned long sp_el0;
+
+ asm ("mrs %0, sp_el0" : "=r" (sp_el0));
+
+ return (struct task_struct *)sp_el0;
+}
+#define current get_current()
+#else
+#include <linux/thread_info.h>
+#define get_current() (current_thread_info()->task)
+#define current get_current()
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_CURRENT_H */
+
diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 279c85b5..2fcb9b7 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -20,6 +20,7 @@
#include <linux/errno.h>
#include <linux/types.h>
+#include <asm/brk-imm.h>
#include <asm/esr.h>
#include <asm/insn.h>
#include <asm/ptrace.h>
@@ -47,19 +48,6 @@
#define BREAK_INSTR_SIZE AARCH64_INSN_SIZE
/*
- * #imm16 values used for BRK instruction generation
- * Allowed values for kgbd are 0x400 - 0x7ff
- * 0x100: for triggering a fault on purpose (reserved)
- * 0x400: for dynamic BRK instruction
- * 0x401: for compile time BRK instruction
- * 0x800: kernel-mode BUG() and WARN() traps
- */
-#define FAULT_BRK_IMM 0x100
-#define KGDB_DYN_DBG_BRK_IMM 0x400
-#define KGDB_COMPILED_DBG_BRK_IMM 0x401
-#define BUG_BRK_IMM 0x800
-
-/*
* BRK instruction encoding
* The #imm16 value should be placed at bits[20:5] within BRK ins
*/
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index ef57220..48e317e 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -1,8 +1,11 @@
#ifndef _ASM_EFI_H
#define _ASM_EFI_H
+#include <asm/cpufeature.h>
#include <asm/io.h>
+#include <asm/mmu_context.h>
#include <asm/neon.h>
+#include <asm/tlbflush.h>
#ifdef CONFIG_EFI
extern void efi_init(void);
@@ -10,6 +13,8 @@
#define efi_init()
#endif
+int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
+
#define efi_call_virt(f, ...) \
({ \
efi_##f##_t *__f; \
@@ -63,6 +68,36 @@
* Services are enabled and the EFI_RUNTIME_SERVICES bit set.
*/
+static inline void efi_set_pgd(struct mm_struct *mm)
+{
+ __switch_mm(mm);
+
+ if (system_uses_ttbr0_pan()) {
+ if (mm != current->active_mm) {
+ /*
+ * Update the current thread's saved ttbr0 since it is
+ * restored as part of a return from exception. Enable
+ * access to the valid TTBR0_EL1 and invoke the errata
+ * workaround directly since there is no return from
+ * exception when invoking the EFI run-time services.
+ */
+ update_saved_ttbr0(current, mm);
+ uaccess_ttbr0_enable();
+ post_ttbr_update_workaround();
+ } else {
+ /*
+ * Defer the switch to the current thread's TTBR0_EL1
+ * until uaccess_enable(). Restore the current
+ * thread's saved ttbr0 corresponding to its active_mm
+ * (if different from init_mm).
+ */
+ uaccess_ttbr0_disable();
+ if (current->active_mm != &init_mm)
+ update_saved_ttbr0(current, current->active_mm);
+ }
+ }
+}
+
void efi_virtmap_load(void);
void efi_virtmap_unload(void);
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 329c127..f9d64ed 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -24,15 +24,6 @@
#include <asm/ptrace.h>
#include <asm/user.h>
-typedef unsigned long elf_greg_t;
-
-#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
-#define ELF_CORE_COPY_REGS(dest, regs) \
- *(struct user_pt_regs *)&(dest) = (regs)->user_regs;
-
-typedef elf_greg_t elf_gregset_t[ELF_NGREG];
-typedef struct user_fpsimd_state elf_fpregset_t;
-
/*
* AArch64 static relocation types.
*/
@@ -86,6 +77,8 @@
#define R_AARCH64_MOVW_PREL_G2_NC 292
#define R_AARCH64_MOVW_PREL_G3 293
+#define R_AARCH64_RELATIVE 1027
+
/*
* These are used to set parameters in the core dumps.
*/
@@ -126,6 +119,17 @@
*/
#define ELF_ET_DYN_BASE (2 * TASK_SIZE_64 / 3)
+#ifndef __ASSEMBLY__
+
+typedef unsigned long elf_greg_t;
+
+#define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
+#define ELF_CORE_COPY_REGS(dest, regs) \
+ *(struct user_pt_regs *)&(dest) = (regs)->user_regs;
+
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+typedef struct user_fpsimd_state elf_fpregset_t;
+
/*
* When the program starts, a1 contains a pointer to a function to be
* registered with atexit, as per the SVR4 ABI. A value of 0 means we have no
@@ -165,7 +169,7 @@
#ifdef CONFIG_COMPAT
/* PIE load location for compat arm. Must match ARM ELF_ET_DYN_BASE. */
-#define COMPAT_ELF_ET_DYN_BASE 0x000400000UL
+#define COMPAT_ELF_ET_DYN_BASE (2 * TASK_SIZE_32 / 3)
/* AArch32 registers. */
#define COMPAT_ELF_NGREG 18
@@ -187,4 +191,6 @@
#endif /* CONFIG_COMPAT */
+#endif /* !__ASSEMBLY__ */
+
#endif
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 77eeb2c..2d4e9c2 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -74,6 +74,7 @@
#define ESR_ELx_EC_SHIFT (26)
#define ESR_ELx_EC_MASK (UL(0x3F) << ESR_ELx_EC_SHIFT)
+#define ESR_ELx_EC(esr) (((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT)
#define ESR_ELx_IL (UL(1) << 25)
#define ESR_ELx_ISS_MASK (ESR_ELx_IL - 1)
@@ -108,6 +109,46 @@
((ESR_ELx_EC_BRK64 << ESR_ELx_EC_SHIFT) | ESR_ELx_IL | \
((imm) & 0xffff))
+/* ISS field definitions for System instruction traps */
+#define ESR_ELx_SYS64_ISS_RES0_SHIFT 22
+#define ESR_ELx_SYS64_ISS_RES0_MASK (UL(0x7) << ESR_ELx_SYS64_ISS_RES0_SHIFT)
+#define ESR_ELx_SYS64_ISS_DIR_MASK 0x1
+#define ESR_ELx_SYS64_ISS_DIR_READ 0x1
+#define ESR_ELx_SYS64_ISS_DIR_WRITE 0x0
+
+#define ESR_ELx_SYS64_ISS_RT_SHIFT 5
+#define ESR_ELx_SYS64_ISS_RT_MASK (UL(0x1f) << ESR_ELx_SYS64_ISS_RT_SHIFT)
+#define ESR_ELx_SYS64_ISS_CRM_SHIFT 1
+#define ESR_ELx_SYS64_ISS_CRM_MASK (UL(0xf) << ESR_ELx_SYS64_ISS_CRM_SHIFT)
+#define ESR_ELx_SYS64_ISS_CRN_SHIFT 10
+#define ESR_ELx_SYS64_ISS_CRN_MASK (UL(0xf) << ESR_ELx_SYS64_ISS_CRN_SHIFT)
+#define ESR_ELx_SYS64_ISS_OP1_SHIFT 14
+#define ESR_ELx_SYS64_ISS_OP1_MASK (UL(0x7) << ESR_ELx_SYS64_ISS_OP1_SHIFT)
+#define ESR_ELx_SYS64_ISS_OP2_SHIFT 17
+#define ESR_ELx_SYS64_ISS_OP2_MASK (UL(0x7) << ESR_ELx_SYS64_ISS_OP2_SHIFT)
+#define ESR_ELx_SYS64_ISS_OP0_SHIFT 20
+#define ESR_ELx_SYS64_ISS_OP0_MASK (UL(0x3) << ESR_ELx_SYS64_ISS_OP0_SHIFT)
+#define ESR_ELx_SYS64_ISS_SYS_MASK (ESR_ELx_SYS64_ISS_OP0_MASK | \
+ ESR_ELx_SYS64_ISS_OP1_MASK | \
+ ESR_ELx_SYS64_ISS_OP2_MASK | \
+ ESR_ELx_SYS64_ISS_CRN_MASK | \
+ ESR_ELx_SYS64_ISS_CRM_MASK)
+#define ESR_ELx_SYS64_ISS_SYS_VAL(op0, op1, op2, crn, crm) \
+ (((op0) << ESR_ELx_SYS64_ISS_OP0_SHIFT) | \
+ ((op1) << ESR_ELx_SYS64_ISS_OP1_SHIFT) | \
+ ((op2) << ESR_ELx_SYS64_ISS_OP2_SHIFT) | \
+ ((crn) << ESR_ELx_SYS64_ISS_CRN_SHIFT) | \
+ ((crm) << ESR_ELx_SYS64_ISS_CRM_SHIFT))
+
+#define ESR_ELx_SYS64_ISS_SYS_OP_MASK (ESR_ELx_SYS64_ISS_SYS_MASK | \
+ ESR_ELx_SYS64_ISS_DIR_MASK)
+
+#define ESR_ELx_SYS64_ISS_SYS_CNTVCT (ESR_ELx_SYS64_ISS_SYS_VAL(3, 3, 2, 14, 0) | \
+ ESR_ELx_SYS64_ISS_DIR_READ)
+
+#define ESR_ELx_SYS64_ISS_SYS_CNTFRQ (ESR_ELx_SYS64_ISS_SYS_VAL(3, 3, 0, 14, 0) | \
+ ESR_ELx_SYS64_ISS_DIR_READ)
+
#ifndef __ASSEMBLY__
#include <asm/types.h>
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 6cb7e1a..0c2eec4 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,7 +18,7 @@
#ifndef __ASM_EXCEPTION_H
#define __ASM_EXCEPTION_H
-#include <linux/ftrace.h>
+#include <linux/interrupt.h>
#define __exception __attribute__((section(".exception.text")))
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/arch/arm64/include/asm/exec.h b/arch/arm64/include/asm/exec.h
index db0563c..f7865dd 100644
--- a/arch/arm64/include/asm/exec.h
+++ b/arch/arm64/include/asm/exec.h
@@ -18,6 +18,9 @@
#ifndef __ASM_EXEC_H
#define __ASM_EXEC_H
+#include <linux/sched.h>
+
extern unsigned long arch_align_stack(unsigned long sp);
+void uao_thread_switch(struct task_struct *next);
#endif /* __ASM_EXEC_H */
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 3097045..03a1e90 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -50,6 +50,11 @@
FIX_EARLYCON_MEM_BASE,
FIX_TEXT_POKE0,
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ FIX_ENTRY_TRAMP_DATA,
+ FIX_ENTRY_TRAMP_TEXT,
+#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
__end_of_permanent_fixed_addresses,
/*
@@ -62,6 +67,16 @@
FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
+
+ /*
+ * Used for kernel page table creation, so unmapped memory may be used
+ * for tables.
+ */
+ FIX_PTE,
+ FIX_PMD,
+ FIX_PUD,
+ FIX_PGD,
+
__end_of_fixed_addresses
};
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index c5534fa..3c60f37 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -28,6 +28,8 @@
extern unsigned long ftrace_graph_call;
+extern void return_to_handler(void);
+
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
/*
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index 195fd56..5bb2fd4 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -21,15 +21,12 @@
#include <linux/futex.h>
#include <linux/uaccess.h>
-#include <asm/alternative.h>
-#include <asm/cpufeature.h>
#include <asm/errno.h>
-#include <asm/sysreg.h>
#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
+do { \
+ uaccess_enable(); \
asm volatile( \
- ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN) \
" prfm pstl1strm, %2\n" \
"1: ldxr %w1, %2\n" \
insn "\n" \
@@ -42,15 +39,13 @@
"4: mov %w0, %w5\n" \
" b 3b\n" \
" .popsection\n" \
-" .pushsection __ex_table,\"a\"\n" \
-" .align 3\n" \
-" .quad 1b, 4b, 2b, 4b\n" \
-" .popsection\n" \
- ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN) \
+ _ASM_EXTABLE(1b, 4b) \
+ _ASM_EXTABLE(2b, 4b) \
: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp) \
: "r" (oparg), "Ir" (-EFAULT) \
- : "memory")
+ : "memory"); \
+ uaccess_disable(); \
+} while (0)
static inline int
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
@@ -102,8 +97,8 @@
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
return -EFAULT;
+ uaccess_enable();
asm volatile("// futex_atomic_cmpxchg_inatomic\n"
-ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
" prfm pstl1strm, %2\n"
"1: ldxr %w1, %2\n"
" sub %w3, %w1, %w4\n"
@@ -116,14 +111,12 @@
"4: mov %w0, %w6\n"
" b 3b\n"
" .popsection\n"
-" .pushsection __ex_table,\"a\"\n"
-" .align 3\n"
-" .quad 1b, 4b, 2b, 4b\n"
-" .popsection\n"
-ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
+ _ASM_EXTABLE(1b, 4b)
+ _ASM_EXTABLE(2b, 4b)
: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
: "memory");
+ uaccess_disable();
*uval = val;
return ret;
diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index a57601f..8740297 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -20,7 +20,7 @@
#include <linux/threads.h>
#include <asm/irq.h>
-#define NR_IPI 5
+#define NR_IPI 6
typedef struct {
unsigned int __softirq_pending;
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index bb4052e..bbc1e35 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -26,36 +26,7 @@
return *ptep;
}
-static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep, pte_t pte)
-{
- set_pte_at(mm, addr, ptep, pte);
-}
-static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
- unsigned long addr, pte_t *ptep)
-{
- ptep_clear_flush(vma, addr, ptep);
-}
-
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
- unsigned long addr, pte_t *ptep)
-{
- ptep_set_wrprotect(mm, addr, ptep);
-}
-
-static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
- unsigned long addr, pte_t *ptep)
-{
- return ptep_get_and_clear(mm, addr, ptep);
-}
-
-static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
- unsigned long addr, pte_t *ptep,
- pte_t pte, int dirty)
-{
- return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
-}
static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
unsigned long addr, unsigned long end,
@@ -97,4 +68,19 @@
clear_bit(PG_dcache_clean, &page->flags);
}
+extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
+ struct page *page, int writable);
+#define arch_make_huge_pte arch_make_huge_pte
+extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pte);
+extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep,
+ pte_t pte, int dirty);
+extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep);
+extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep);
+extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep);
+
#endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/include/asm/hw_breakpoint.h b/arch/arm64/include/asm/hw_breakpoint.h
index 9732908..c72b8e2 100644
--- a/arch/arm64/include/asm/hw_breakpoint.h
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -68,7 +68,11 @@
/* Lengths */
#define ARM_BREAKPOINT_LEN_1 0x1
#define ARM_BREAKPOINT_LEN_2 0x3
+#define ARM_BREAKPOINT_LEN_3 0x7
#define ARM_BREAKPOINT_LEN_4 0xf
+#define ARM_BREAKPOINT_LEN_5 0x1f
+#define ARM_BREAKPOINT_LEN_6 0x3f
+#define ARM_BREAKPOINT_LEN_7 0x7f
#define ARM_BREAKPOINT_LEN_8 0xff
/* Kernel stepping */
@@ -110,7 +114,7 @@
struct pmu;
extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
- int *gen_len, int *gen_type);
+ int *gen_len, int *gen_type, int *offset);
extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 8e8d3068..b77197d 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -1,10 +1,45 @@
#ifndef __ASM_IRQ_H
#define __ASM_IRQ_H
+#define IRQ_STACK_SIZE THREAD_SIZE
+#define IRQ_STACK_START_SP THREAD_START_SP
+
+#ifndef __ASSEMBLER__
+
+#include <linux/percpu.h>
+
#include <asm-generic/irq.h>
+#include <asm/thread_info.h>
struct pt_regs;
+DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
+
+/*
+ * The highest address on the stack, and the first to be used. Used to
+ * find the dummy-stack frame put down by el?_irq() in entry.S, which
+ * is structured as follows:
+ *
+ * ------------
+ * | | <- irq_stack_ptr
+ * top ------------
+ * | x19 | <- irq_stack_ptr - 0x08
+ * ------------
+ * | x29 | <- irq_stack_ptr - 0x10
+ * ------------
+ *
+ * where x19 holds a copy of the task stack pointer where the struct pt_regs
+ * from kernel_entry can be found.
+ *
+ */
+#define IRQ_STACK_PTR(cpu) ((unsigned long)per_cpu(irq_stack, cpu) + IRQ_STACK_START_SP)
+
+/*
+ * The offset from irq_stack_ptr where entry.S will store the original
+ * stack pointer. Used by unwind_frame() and dump_backtrace().
+ */
+#define IRQ_STACK_TO_TASK_STACK(ptr) (*((unsigned long *)((ptr) - 0x08)))
+
extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
static inline int nr_legacy_irqs(void)
@@ -12,4 +47,14 @@
return 0;
}
+static inline bool on_irq_stack(unsigned long sp, int cpu)
+{
+ /* variable names the same as kernel/stacktrace.c */
+ unsigned long low = (unsigned long)per_cpu(irq_stack, cpu);
+ unsigned long high = low + IRQ_STACK_START_SP;
+
+ return (low <= sp && sp <= high);
+}
+
+#endif /* !__ASSEMBLER__ */
#endif
diff --git a/arch/arm64/include/asm/kasan.h b/arch/arm64/include/asm/kasan.h
index 2774fa3..71ad0f9 100644
--- a/arch/arm64/include/asm/kasan.h
+++ b/arch/arm64/include/asm/kasan.h
@@ -7,13 +7,14 @@
#include <linux/linkage.h>
#include <asm/memory.h>
+#include <asm/pgtable-types.h>
/*
* KASAN_SHADOW_START: beginning of the kernel virtual addresses.
* KASAN_SHADOW_END: KASAN_SHADOW_START + 1/8 of kernel virtual addresses.
*/
#define KASAN_SHADOW_START (VA_START)
-#define KASAN_SHADOW_END (KASAN_SHADOW_START + (1UL << (VA_BITS - 3)))
+#define KASAN_SHADOW_END (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
/*
* This value is used to map an address to the corresponding shadow
@@ -28,10 +29,12 @@
#define KASAN_SHADOW_OFFSET (KASAN_SHADOW_END - (1ULL << (64 - 3)))
void kasan_init(void);
+void kasan_copy_shadow(pgd_t *pgdir);
asmlinkage void kasan_early_init(void);
#else
static inline void kasan_init(void) { }
+static inline void kasan_copy_shadow(pgd_t *pgdir) { }
#endif
#endif
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index a459714..77a27af 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -19,6 +19,8 @@
#ifndef __ASM_KERNEL_PGTABLE_H
#define __ASM_KERNEL_PGTABLE_H
+#include <asm/pgtable.h>
+#include <asm/sparsemem.h>
/*
* The linear mapping and the start of memory are both 2M aligned (per
@@ -53,6 +55,12 @@
#define SWAPPER_DIR_SIZE (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
#define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+#define RESERVED_TTBR0_SIZE (PAGE_SIZE)
+#else
+#define RESERVED_TTBR0_SIZE (0)
+#endif
+
/* Initial memory map size */
#if ARM64_SWAPPER_USES_SECTION_MAPS
#define SWAPPER_BLOCK_SHIFT SECTION_SHIFT
@@ -70,8 +78,16 @@
/*
* Initial memory map attributes.
*/
-#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define _SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+#define _SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define SWAPPER_PTE_FLAGS (_SWAPPER_PTE_FLAGS | PTE_NG)
+#define SWAPPER_PMD_FLAGS (_SWAPPER_PMD_FLAGS | PMD_SECT_NG)
+#else
+#define SWAPPER_PTE_FLAGS _SWAPPER_PTE_FLAGS
+#define SWAPPER_PMD_FLAGS _SWAPPER_PMD_FLAGS
+#endif
#if ARM64_SWAPPER_USES_SECTION_MAPS
#define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
@@ -79,5 +95,31 @@
#define SWAPPER_MM_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
#endif
+/*
+ * To make optimal use of block mappings when laying out the linear
+ * mapping, round down the base of physical memory to a size that can
+ * be mapped efficiently, i.e., either PUD_SIZE (4k granule) or PMD_SIZE
+ * (64k granule), or a multiple that can be mapped using contiguous bits
+ * in the page tables: 32 * PMD_SIZE (16k granule)
+ */
+#if defined(CONFIG_ARM64_4K_PAGES)
+#define ARM64_MEMSTART_SHIFT PUD_SHIFT
+#elif defined(CONFIG_ARM64_16K_PAGES)
+#define ARM64_MEMSTART_SHIFT (PMD_SHIFT + 5)
+#else
+#define ARM64_MEMSTART_SHIFT PMD_SHIFT
+#endif
+
+/*
+ * sparsemem vmemmap imposes an additional requirement on the alignment of
+ * memstart_addr, due to the fact that the base of the vmemmap region
+ * has a direct correspondence, and needs to appear sufficiently aligned
+ * in the virtual address space.
+ */
+#if defined(CONFIG_SPARSEMEM_VMEMMAP) && ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
+#define ARM64_MEMSTART_ALIGN (1UL << SECTION_SIZE_BITS)
+#else
+#define ARM64_MEMSTART_ALIGN (1UL << ARM64_MEMSTART_SHIFT)
+#endif
#endif /* __ASM_KERNEL_PGTABLE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 5e37710..419bc66 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -102,6 +102,8 @@
#define KVM_ARM64_DEBUG_DIRTY_SHIFT 0
#define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
+#define kvm_ksym_ref(sym) phys_to_virt((u64)&sym - kimage_voffset)
+
#ifndef __ASSEMBLY__
struct kvm;
struct kvm_vcpu;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a35ce72..90c6368 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -222,7 +222,7 @@
struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
-u64 kvm_call_hyp(void *hypfn, ...);
+u64 __kvm_call_hyp(void *hypfn, ...);
void force_vm_exit(const cpumask_t *mask);
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
@@ -243,8 +243,8 @@
* Call initialization code, and switch to the full blown
* HYP code.
*/
- kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
- hyp_stack_ptr, vector_ptr);
+ __kvm_call_hyp((void *)boot_pgd_ptr, pgd_ptr,
+ hyp_stack_ptr, vector_ptr);
}
static inline void kvm_arch_hardware_disable(void) {}
@@ -258,4 +258,6 @@
void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+#define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
+
#endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 819b21a..2b1020a 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -265,7 +265,7 @@
kvm_flush_dcache_to_poc(page_address(page), PUD_SIZE);
}
-#define kvm_virt_to_phys(x) __virt_to_phys((unsigned long)(x))
+#define kvm_virt_to_phys(x) __pa_symbol(x)
void kvm_set_way_flush(struct kvm_vcpu *vcpu);
void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled);
diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
index 3de42d6..23acc00 100644
--- a/arch/arm64/include/asm/lse.h
+++ b/arch/arm64/include/asm/lse.h
@@ -26,6 +26,7 @@
/* Macro for constructing calls to out-of-line ll/sc atomics */
#define __LL_SC_CALL(op) "bl\t" __stringify(__LL_SC_PREFIX(op)) "\n"
+#define __LL_SC_CLOBBERS "x16", "x17", "x30"
/* In-line patching at runtime */
#define ARM64_LSE_ATOMIC_INSN(llsc, lse) \
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b42b930..6b17d08 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -24,6 +24,7 @@
#include <linux/compiler.h>
#include <linux/const.h>
#include <linux/types.h>
+#include <asm/bug.h>
#include <asm/sizes.h>
/*
@@ -45,17 +46,17 @@
* VA_START - the first kernel virtual address.
* TASK_SIZE - the maximum size of a user space task.
* TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
- * The module space lives between the addresses given by TASK_SIZE
- * and PAGE_OFFSET - it must be within 128MB of the kernel text.
*/
#define VA_BITS (CONFIG_ARM64_VA_BITS)
#define VA_START (UL(0xffffffffffffffff) - \
(UL(1) << VA_BITS) + 1)
#define PAGE_OFFSET (UL(0xffffffffffffffff) - \
(UL(1) << (VA_BITS - 1)) + 1)
-#define MODULES_END (PAGE_OFFSET)
-#define MODULES_VADDR (MODULES_END - SZ_64M)
-#define PCI_IO_END (MODULES_VADDR - SZ_2M)
+#define KIMAGE_VADDR (MODULES_END)
+#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
+#define MODULES_VADDR (VA_START + KASAN_SHADOW_SIZE)
+#define MODULES_VSIZE (SZ_128M)
+#define PCI_IO_END (PAGE_OFFSET - SZ_2M)
#define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE)
#define FIXADDR_TOP (PCI_IO_START - SZ_2M)
#define TASK_SIZE_64 (UL(1) << VA_BITS)
@@ -73,12 +74,27 @@
#define TASK_UNMAPPED_BASE (PAGE_ALIGN(TASK_SIZE / 4))
/*
+ * The size of the KASAN shadow region. This should be 1/8th of the
+ * size of the entire kernel virtual address space.
+ */
+#ifdef CONFIG_KASAN
+#define KASAN_SHADOW_SIZE (UL(1) << (VA_BITS - 3))
+#else
+#define KASAN_SHADOW_SIZE (0)
+#endif
+
+/*
* Physical vs virtual RAM address space conversion. These are
* private definitions which should NOT be used outside memory.h
* files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
*/
-#define __virt_to_phys(x) (((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
-#define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
+#define __virt_to_phys(x) ({ \
+ phys_addr_t __x = (phys_addr_t)(x); \
+ __x & BIT(VA_BITS - 1) ? (__x & ~PAGE_OFFSET) + PHYS_OFFSET : \
+ (__x - kimage_voffset); })
+
+#define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET) | PAGE_OFFSET)
+#define __phys_to_kimg(x) ((unsigned long)((x) + kimage_voffset))
/*
* Convert a page to/from a physical address
@@ -102,19 +118,45 @@
#define MT_S2_NORMAL 0xf
#define MT_S2_DEVICE_nGnRE 0x1
+#ifdef CONFIG_ARM64_4K_PAGES
+#define IOREMAP_MAX_ORDER (PUD_SHIFT)
+#else
+#define IOREMAP_MAX_ORDER (PMD_SHIFT)
+#endif
+
+#ifdef CONFIG_BLK_DEV_INITRD
+#define __early_init_dt_declare_initrd(__start, __end) \
+ do { \
+ initrd_start = (__start); \
+ initrd_end = (__end); \
+ } while (0)
+#endif
+
#ifndef __ASSEMBLY__
-extern phys_addr_t memstart_addr;
+#include <linux/bitops.h>
+#include <linux/mmdebug.h>
+
+extern s64 memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
-#define PHYS_OFFSET ({ memstart_addr; })
+#define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
+
+/* the virtual base of the kernel image (minus TEXT_OFFSET) */
+extern u64 kimage_vaddr;
+
+/* the offset between the kernel virtual and physical mappings */
+extern u64 kimage_voffset;
+
+static inline unsigned long kaslr_offset(void)
+{
+ return kimage_vaddr - KIMAGE_VADDR;
+}
/*
- * The maximum physical address that the linear direct mapping
- * of system RAM can cover. (PAGE_OFFSET can be interpreted as
- * a 2's complement signed quantity and negated to derive the
- * maximum size of the linear mapping.)
+ * Allow all memory at the discovery stage. We will clip it later.
*/
-#define MAX_MEMBLOCK_ADDR ({ memstart_addr - PAGE_OFFSET - 1; })
+#define MIN_MEMBLOCK_ADDR 0
+#define MAX_MEMBLOCK_ADDR U64_MAX
/*
* PFNs are used to describe any physical page; this means
@@ -150,6 +192,7 @@
#define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x)))
#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
#define virt_to_pfn(x) __phys_to_pfn(__virt_to_phys(x))
+#define sym_to_pfn(x) __phys_to_pfn(__pa_symbol(x))
/*
* virt_to_page(k) convert a _valid_ virtual address to struct page *
@@ -158,7 +201,11 @@
#define ARCH_PFN_OFFSET ((unsigned long)PHYS_PFN_OFFSET)
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
-#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
+#define _virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
+
+#define _virt_addr_is_linear(kaddr) (((u64)(kaddr)) >= PAGE_OFFSET)
+#define virt_addr_valid(kaddr) (_virt_addr_is_linear(kaddr) && \
+ _virt_addr_valid(kaddr))
#endif
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 990124a..39e502f 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -16,6 +16,11 @@
#ifndef __ASM_MMU_H
#define __ASM_MMU_H
+#define USER_ASID_FLAG (UL(1) << 48)
+#define TTBR_ASID_MASK (UL(0xffff) << 48)
+
+#ifndef __ASSEMBLY__
+
typedef struct {
atomic64_t id;
void *vdso;
@@ -28,6 +33,12 @@
*/
#define ASID(mm) ((mm)->context.id.counter & 0xffff)
+static inline bool arm64_kernel_unmapped_at_el0(void)
+{
+ return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) &&
+ cpus_have_cap(ARM64_UNMAP_KERNEL_AT_EL0);
+}
+
extern void paging_init(void);
extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
extern void init_mem_pgprot(void);
@@ -36,4 +47,5 @@
pgprot_t prot);
extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
+#endif /* !__ASSEMBLY__ */
#endif
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2416578..8848c49 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -23,10 +23,12 @@
#include <linux/sched.h>
#include <asm/cacheflush.h>
+#include <asm/cpufeature.h>
#include <asm/proc-fns.h>
#include <asm-generic/mm_hooks.h>
#include <asm/cputype.h>
#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
#ifdef CONFIG_PID_IN_CONTEXTIDR
static inline void contextidr_thread_switch(struct task_struct *next)
@@ -48,7 +50,7 @@
*/
static inline void cpu_set_reserved_ttbr0(void)
{
- unsigned long ttbr = page_to_phys(empty_zero_page);
+ unsigned long ttbr = __pa_symbol(empty_zero_page);
asm(
" msr ttbr0_el1, %0 // set TTBR0\n"
@@ -57,6 +59,13 @@
: "r" (ttbr));
}
+static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
+{
+ BUG_ON(pgd == swapper_pg_dir);
+ cpu_set_reserved_ttbr0();
+ cpu_do_switch_mm(virt_to_phys(pgd),mm);
+}
+
/*
* TCR.T0SZ value to use when the ID map is active. Usually equals
* TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
@@ -73,7 +82,7 @@
/*
* Set TCR.T0SZ to its default value (based on VA_BITS)
*/
-static inline void cpu_set_default_tcr_t0sz(void)
+static inline void __cpu_set_tcr_t0sz(unsigned long t0sz)
{
unsigned long tcr;
@@ -86,7 +95,62 @@
" msr tcr_el1, %0 ;"
" isb"
: "=&r" (tcr)
- : "r"(TCR_T0SZ(VA_BITS)), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+ : "r"(t0sz), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+}
+
+#define cpu_set_default_tcr_t0sz() __cpu_set_tcr_t0sz(TCR_T0SZ(VA_BITS))
+#define cpu_set_idmap_tcr_t0sz() __cpu_set_tcr_t0sz(idmap_t0sz)
+
+/*
+ * Remove the idmap from TTBR0_EL1 and install the pgd of the active mm.
+ *
+ * The idmap lives in the same VA range as userspace, but uses global entries
+ * and may use a different TCR_EL1.T0SZ. To avoid issues resulting from
+ * speculative TLB fetches, we must temporarily install the reserved page
+ * tables while we invalidate the TLBs and set up the correct TCR_EL1.T0SZ.
+ *
+ * If current is a not a user task, the mm covers the TTBR1_EL1 page tables,
+ * which should not be installed in TTBR0_EL1. In this case we can leave the
+ * reserved page tables in place.
+ */
+static inline void cpu_uninstall_idmap(void)
+{
+ struct mm_struct *mm = current->active_mm;
+
+ cpu_set_reserved_ttbr0();
+ local_flush_tlb_all();
+ cpu_set_default_tcr_t0sz();
+
+ if (mm != &init_mm && !system_uses_ttbr0_pan())
+ cpu_switch_mm(mm->pgd, mm);
+}
+
+static inline void cpu_install_idmap(void)
+{
+ cpu_set_reserved_ttbr0();
+ local_flush_tlb_all();
+ cpu_set_idmap_tcr_t0sz();
+
+ cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm);
+}
+
+/*
+ * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
+ * avoiding the possibility of conflicting TLB entries being allocated.
+ */
+static inline void cpu_replace_ttbr1(pgd_t *pgd)
+{
+ typedef void (ttbr_replace_func)(phys_addr_t);
+ extern ttbr_replace_func idmap_cpu_replace_ttbr1;
+ ttbr_replace_func *replace_phys;
+
+ phys_addr_t pgd_phys = virt_to_phys(pgd);
+
+ replace_phys = (void *)__pa_symbol(idmap_cpu_replace_ttbr1);
+
+ cpu_install_idmap();
+ replace_phys(pgd_phys);
+ cpu_uninstall_idmap();
}
/*
@@ -117,21 +181,28 @@
{
}
-/*
- * This is the actual mm switch as far as the scheduler
- * is concerned. No registers are touched. We avoid
- * calling the CPU specific function when the mm hasn't
- * actually changed.
- */
-static inline void
-switch_mm(struct mm_struct *prev, struct mm_struct *next,
- struct task_struct *tsk)
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+static inline void update_saved_ttbr0(struct task_struct *tsk,
+ struct mm_struct *mm)
+{
+ if (system_uses_ttbr0_pan()) {
+ u64 ttbr;
+ BUG_ON(mm->pgd == swapper_pg_dir);
+ ttbr = virt_to_phys(mm->pgd) | ASID(mm) << 48;
+ WRITE_ONCE(task_thread_info(tsk)->ttbr0, ttbr);
+ }
+}
+#else
+static inline void update_saved_ttbr0(struct task_struct *tsk,
+ struct mm_struct *mm)
+{
+}
+#endif
+
+static inline void __switch_mm(struct mm_struct *next)
{
unsigned int cpu = smp_processor_id();
- if (prev == next)
- return;
-
/*
* init_mm.pgd does not contain any user mappings and it is always
* active for kernel addresses in TTBR1. Just set the reserved TTBR0.
@@ -144,7 +215,27 @@
check_and_switch_context(next, cpu);
}
+static inline void
+switch_mm(struct mm_struct *prev, struct mm_struct *next,
+ struct task_struct *tsk)
+{
+ if (prev != next)
+ __switch_mm(next);
+
+ /*
+ * Update the saved TTBR0_EL1 of the scheduled-in task as the previous
+ * value may have not been initialised yet (activate_mm caller) or the
+ * ASID has changed since the last run (following the context switch
+ * of another thread of the same process). Avoid setting the reserved
+ * TTBR0_EL1 to swapper_pg_dir (init_mm; e.g. via idle_task_exit).
+ */
+ if (next != &init_mm)
+ update_saved_ttbr0(tsk, next);
+}
+
#define deactivate_mm(tsk,mm) do { } while (0)
-#define activate_mm(prev,next) switch_mm(prev, next, NULL)
+#define activate_mm(prev,next) switch_mm(prev, next, current)
+
+void post_ttbr_update_workaround(void);
#endif
diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
index e80e232..06ff7fd 100644
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -17,7 +17,29 @@
#define __ASM_MODULE_H
#include <asm-generic/module.h>
+#include <asm/memory.h>
#define MODULE_ARCH_VERMAGIC "aarch64"
+#ifdef CONFIG_ARM64_MODULE_PLTS
+struct mod_arch_specific {
+ struct elf64_shdr *plt;
+ int plt_num_entries;
+ int plt_max_entries;
+};
+#endif
+
+u64 module_emit_plt_entry(struct module *mod, const Elf64_Rela *rela,
+ Elf64_Sym *sym);
+
+#ifdef CONFIG_RANDOMIZE_BASE
+#ifdef CONFIG_MODVERSIONS
+#define ARCH_RELOCATES_KCRCTAB
+#define reloc_start (kimage_vaddr - KIMAGE_VADDR)
+#endif
+extern u64 module_alloc_base;
+#else
+#define module_alloc_base ((u64)_etext - MODULES_VSIZE)
+#endif
+
#endif /* __ASM_MODULE_H */
diff --git a/arch/arm64/include/asm/percpu.h b/arch/arm64/include/asm/percpu.h
index 8a33685..2ce1a02 100644
--- a/arch/arm64/include/asm/percpu.h
+++ b/arch/arm64/include/asm/percpu.h
@@ -16,6 +16,8 @@
#ifndef __ASM_PERCPU_H
#define __ASM_PERCPU_H
+#include <asm/stack_pointer.h>
+
static inline void set_my_cpu_offset(unsigned long off)
{
asm volatile("msr tpidr_el1, %0" :: "r" (off) : "memory");
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 7bd3cdb..91b6be0 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -17,6 +17,8 @@
#ifndef __ASM_PERF_EVENT_H
#define __ASM_PERF_EVENT_H
+#include <asm/stack_pointer.h>
+
#ifdef CONFIG_PERF_EVENTS
struct pt_regs;
extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index c150539..ff98585 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -42,11 +42,20 @@
free_page((unsigned long)pmd);
}
-static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+static inline void __pud_populate(pud_t *pud, phys_addr_t pmd, pudval_t prot)
{
- set_pud(pud, __pud(__pa(pmd) | PMD_TYPE_TABLE));
+ set_pud(pud, __pud(pmd | prot));
}
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+{
+ __pud_populate(pud, __pa(pmd), PMD_TYPE_TABLE);
+}
+#else
+static inline void __pud_populate(pud_t *pud, phys_addr_t pmd, pudval_t prot)
+{
+ BUILD_BUG();
+}
#endif /* CONFIG_PGTABLE_LEVELS > 2 */
#if CONFIG_PGTABLE_LEVELS > 3
@@ -62,11 +71,20 @@
free_page((unsigned long)pud);
}
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pud, pgdval_t prot)
{
- set_pgd(pgd, __pgd(__pa(pud) | PUD_TYPE_TABLE));
+ set_pgd(pgdp, __pgd(pud | prot));
}
+static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+{
+ __pgd_populate(pgd, __pa(pud), PUD_TYPE_TABLE);
+}
+#else
+static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pud, pgdval_t prot)
+{
+ BUILD_BUG();
+}
#endif /* CONFIG_PGTABLE_LEVELS > 3 */
extern pgd_t *pgd_alloc(struct mm_struct *mm);
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index b9da954..d7890c0 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -90,7 +90,23 @@
/*
* Contiguous page definitions.
*/
-#define CONT_PTES (_AC(1, UL) << CONT_SHIFT)
+#ifdef CONFIG_ARM64_64K_PAGES
+#define CONT_PTE_SHIFT 5
+#define CONT_PMD_SHIFT 5
+#elif defined(CONFIG_ARM64_16K_PAGES)
+#define CONT_PTE_SHIFT 7
+#define CONT_PMD_SHIFT 5
+#else
+#define CONT_PTE_SHIFT 4
+#define CONT_PMD_SHIFT 4
+#endif
+
+#define CONT_PTES (1 << CONT_PTE_SHIFT)
+#define CONT_PTE_SIZE (CONT_PTES * PAGE_SIZE)
+#define CONT_PTE_MASK (~(CONT_PTE_SIZE - 1))
+#define CONT_PMDS (1 << CONT_PMD_SHIFT)
+#define CONT_PMD_SIZE (CONT_PMDS * PMD_SIZE)
+#define CONT_PMD_MASK (~(CONT_PMD_SIZE - 1))
/* the the numerical offset of the PTE within a range of CONT_PTES */
#define CONT_RANGE_OFFSET(addr) (((addr)>>PAGE_SHIFT)&(CONT_PTES-1))
@@ -208,6 +224,8 @@
#define TCR_TG1_16K (UL(1) << 30)
#define TCR_TG1_4K (UL(2) << 30)
#define TCR_TG1_64K (UL(3) << 30)
+
+#define TCR_A1 (UL(1) << 22)
#define TCR_ASID16 (UL(1) << 36)
#define TCR_TBI0 (UL(1) << 37)
#define TCR_HA (UL(1) << 39)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 67c2ad6..81eee57 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -36,19 +36,13 @@
*
* VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
* (rounded up to PUD_SIZE).
- * VMALLOC_START: beginning of the kernel VA space
+ * VMALLOC_START: beginning of the kernel vmalloc space
* VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
* fixed mappings and modules
*/
#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
-#ifndef CONFIG_KASAN
-#define VMALLOC_START (VA_START)
-#else
-#include <asm/kasan.h>
-#define VMALLOC_START (KASAN_SHADOW_END + SZ_64K)
-#endif
-
+#define VMALLOC_START (MODULES_END)
#define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
#define VMEMMAP_START (VMALLOC_END + SZ_64K)
@@ -59,6 +53,7 @@
#ifndef __ASSEMBLY__
+#include <asm/fixmap.h>
#include <linux/mmdebug.h>
extern void __pte_error(const char *file, int line, unsigned long val);
@@ -66,8 +61,16 @@
extern void __pud_error(const char *file, int line, unsigned long val);
extern void __pgd_error(const char *file, int line, unsigned long val);
-#define PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+#define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define PROT_DEFAULT (_PROT_DEFAULT | PTE_NG)
+#define PROT_SECT_DEFAULT (_PROT_SECT_DEFAULT | PMD_SECT_NG)
+#else
+#define PROT_DEFAULT _PROT_DEFAULT
+#define PROT_SECT_DEFAULT _PROT_SECT_DEFAULT
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
#define PROT_DEVICE_nGnRnE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE))
#define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE))
@@ -80,6 +83,7 @@
#define PROT_SECT_NORMAL_EXEC (PROT_SECT_DEFAULT | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
#define _PAGE_DEFAULT (PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL))
+#define _HYP_PAGE_DEFAULT (_PAGE_DEFAULT & ~PTE_NG)
#define PAGE_KERNEL __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE)
#define PAGE_KERNEL_RO __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_RDONLY)
@@ -87,13 +91,13 @@
#define PAGE_KERNEL_EXEC __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE)
#define PAGE_KERNEL_EXEC_CONT __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_CONT)
-#define PAGE_HYP __pgprot(_PAGE_DEFAULT | PTE_HYP)
+#define PAGE_HYP __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP)
#define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
#define PAGE_S2 __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY)
#define PAGE_S2_DEVICE __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
-#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_PXN | PTE_UXN)
+#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_NG | PTE_PXN | PTE_UXN)
#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
#define PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_WRITE)
#define PAGE_COPY __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN)
@@ -123,8 +127,8 @@
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-extern struct page *empty_zero_page;
-#define ZERO_PAGE(vaddr) (empty_zero_page)
+extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
+#define ZERO_PAGE(vaddr) phys_to_page(__pa_symbol(empty_zero_page))
#define pte_ERROR(pte) __pte_error(__FILE__, __LINE__, pte_val(pte))
@@ -136,16 +140,6 @@
#define pte_clear(mm,addr,ptep) set_pte(ptep, __pte(0))
#define pte_page(pte) (pfn_to_page(pte_pfn(pte)))
-/* Find an entry in the third-level page table. */
-#define pte_index(addr) (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
-
-#define pte_offset_kernel(dir,addr) (pmd_page_vaddr(*(dir)) + pte_index(addr))
-
-#define pte_offset_map(dir,addr) pte_offset_kernel((dir), (addr))
-#define pte_offset_map_nested(dir,addr) pte_offset_kernel((dir), (addr))
-#define pte_unmap(pte) do { } while (0)
-#define pte_unmap_nested(pte) do { } while (0)
-
/*
* The following only work if pte_present(). Undefined behaviour otherwise.
*/
@@ -168,6 +162,16 @@
#define pte_valid(pte) (!!(pte_val(pte) & PTE_VALID))
#define pte_valid_not_user(pte) \
((pte_val(pte) & (PTE_VALID | PTE_USER)) == PTE_VALID)
+#define pte_valid_young(pte) \
+ ((pte_val(pte) & (PTE_VALID | PTE_AF)) == (PTE_VALID | PTE_AF))
+
+/*
+ * Could the pte be present in the TLB? We must check mm_tlb_flush_pending
+ * so that we don't erroneously return false for pages that have been
+ * remapped as PROT_NONE but are yet to be flushed from the TLB.
+ */
+#define pte_accessible(mm, pte) \
+ (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid_young(pte))
static inline pte_t clear_pte_bit(pte_t pte, pgprot_t prot)
{
@@ -218,7 +222,8 @@
static inline pte_t pte_mkcont(pte_t pte)
{
- return set_pte_bit(pte, __pgprot(PTE_CONT));
+ pte = set_pte_bit(pte, __pgprot(PTE_CONT));
+ return set_pte_bit(pte, __pgprot(PTE_TYPE_PAGE));
}
static inline pte_t pte_mknoncont(pte_t pte)
@@ -226,6 +231,11 @@
return clear_pte_bit(pte, __pgprot(PTE_CONT));
}
+static inline pmd_t pmd_mkcont(pmd_t pmd)
+{
+ return __pmd(pmd_val(pmd) | PMD_SECT_CONT);
+}
+
static inline void set_pte(pte_t *ptep, pte_t pte)
{
*ptep = pte;
@@ -299,7 +309,7 @@
/*
* Hugetlb definitions.
*/
-#define HUGE_MAX_HSTATE 2
+#define HUGE_MAX_HSTATE 4
#define HPAGE_SHIFT PMD_SHIFT
#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
#define HPAGE_MASK (~(HPAGE_SIZE - 1))
@@ -425,13 +435,31 @@
set_pmd(pmdp, __pmd(0));
}
-static inline pte_t *pmd_page_vaddr(pmd_t pmd)
+static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
{
- return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
+ return pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK;
}
+/* Find an entry in the third-level page table. */
+#define pte_index(addr) (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+
+#define pte_offset_phys(dir,addr) (pmd_page_paddr(*(dir)) + pte_index(addr) * sizeof(pte_t))
+#define pte_offset_kernel(dir,addr) ((pte_t *)__va(pte_offset_phys((dir), (addr))))
+
+#define pte_offset_map(dir,addr) pte_offset_kernel((dir), (addr))
+#define pte_offset_map_nested(dir,addr) pte_offset_kernel((dir), (addr))
+#define pte_unmap(pte) do { } while (0)
+#define pte_unmap_nested(pte) do { } while (0)
+
+#define pte_set_fixmap(addr) ((pte_t *)set_fixmap_offset(FIX_PTE, addr))
+#define pte_set_fixmap_offset(pmd, addr) pte_set_fixmap(pte_offset_phys(pmd, addr))
+#define pte_clear_fixmap() clear_fixmap(FIX_PTE)
+
#define pmd_page(pmd) pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+/* use ONLY for statically allocated translation tables */
+#define pte_offset_kimg(dir,addr) ((pte_t *)__phys_to_kimg(pte_offset_phys((dir), (addr))))
+
/*
* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
@@ -458,21 +486,37 @@
set_pud(pudp, __pud(0));
}
-static inline pmd_t *pud_page_vaddr(pud_t pud)
+static inline phys_addr_t pud_page_paddr(pud_t pud)
{
- return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
+ return pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK;
}
/* Find an entry in the second-level page table. */
#define pmd_index(addr) (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
-static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
-{
- return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
-}
+#define pmd_offset_phys(dir, addr) (pud_page_paddr(*(dir)) + pmd_index(addr) * sizeof(pmd_t))
+#define pmd_offset(dir, addr) ((pmd_t *)__va(pmd_offset_phys((dir), (addr))))
+
+#define pmd_set_fixmap(addr) ((pmd_t *)set_fixmap_offset(FIX_PMD, addr))
+#define pmd_set_fixmap_offset(pud, addr) pmd_set_fixmap(pmd_offset_phys(pud, addr))
+#define pmd_clear_fixmap() clear_fixmap(FIX_PMD)
#define pud_page(pud) pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
+/* use ONLY for statically allocated translation tables */
+#define pmd_offset_kimg(dir,addr) ((pmd_t *)__phys_to_kimg(pmd_offset_phys((dir), (addr))))
+
+#else
+
+#define pud_page_paddr(pud) ({ BUILD_BUG(); 0; })
+
+/* Match pmd_offset folding in <asm/generic/pgtable-nopmd.h> */
+#define pmd_set_fixmap(addr) NULL
+#define pmd_set_fixmap_offset(pudp, addr) ((pmd_t *)pudp)
+#define pmd_clear_fixmap()
+
+#define pmd_offset_kimg(dir,addr) ((pmd_t *)dir)
+
#endif /* CONFIG_PGTABLE_LEVELS > 2 */
#if CONFIG_PGTABLE_LEVELS > 3
@@ -494,21 +538,37 @@
set_pgd(pgdp, __pgd(0));
}
-static inline pud_t *pgd_page_vaddr(pgd_t pgd)
+static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
{
- return __va(pgd_val(pgd) & PHYS_MASK & (s32)PAGE_MASK);
+ return pgd_val(pgd) & PHYS_MASK & (s32)PAGE_MASK;
}
/* Find an entry in the frst-level page table. */
#define pud_index(addr) (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
-static inline pud_t *pud_offset(pgd_t *pgd, unsigned long addr)
-{
- return (pud_t *)pgd_page_vaddr(*pgd) + pud_index(addr);
-}
+#define pud_offset_phys(dir, addr) (pgd_page_paddr(*(dir)) + pud_index(addr) * sizeof(pud_t))
+#define pud_offset(dir, addr) ((pud_t *)__va(pud_offset_phys((dir), (addr))))
+
+#define pud_set_fixmap(addr) ((pud_t *)set_fixmap_offset(FIX_PUD, addr))
+#define pud_set_fixmap_offset(pgd, addr) pud_set_fixmap(pud_offset_phys(pgd, addr))
+#define pud_clear_fixmap() clear_fixmap(FIX_PUD)
#define pgd_page(pgd) pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
+/* use ONLY for statically allocated translation tables */
+#define pud_offset_kimg(dir,addr) ((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
+
+#else
+
+#define pgd_page_paddr(pgd) ({ BUILD_BUG(); 0;})
+
+/* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
+#define pud_set_fixmap(addr) NULL
+#define pud_set_fixmap_offset(pgdp, addr) ((pud_t *)pgdp)
+#define pud_clear_fixmap()
+
+#define pud_offset_kimg(dir,addr) ((pud_t *)dir)
+
#endif /* CONFIG_PGTABLE_LEVELS > 3 */
#define pgd_ERROR(pgd) __pgd_error(__FILE__, __LINE__, pgd_val(pgd))
@@ -516,11 +576,16 @@
/* to find an entry in a page-table-directory */
#define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
-#define pgd_offset(mm, addr) ((mm)->pgd+pgd_index(addr))
+#define pgd_offset_raw(pgd, addr) ((pgd) + pgd_index(addr))
+
+#define pgd_offset(mm, addr) (pgd_offset_raw((mm)->pgd, (addr)))
/* to find an entry in a kernel page-table-directory */
#define pgd_offset_k(addr) pgd_offset(&init_mm, addr)
+#define pgd_set_fixmap(addr) ((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
+#define pgd_clear_fixmap() clear_fixmap(FIX_PGD)
+
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
{
const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
@@ -649,6 +714,7 @@
extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
+extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
/*
* Encode and decode a swap entry:
@@ -681,7 +747,8 @@
#include <asm-generic/pgtable.h>
-#define pgtable_cache_init() do { } while (0)
+void pgd_cache_init(void);
+#define pgtable_cache_init pgd_cache_init
/*
* On AArch64, the cache coherency is handled via the set_pte_at() function.
diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
index 14ad6e4e..16cef2e 100644
--- a/arch/arm64/include/asm/proc-fns.h
+++ b/arch/arm64/include/asm/proc-fns.h
@@ -35,12 +35,6 @@
#include <asm/memory.h>
-#define cpu_switch_mm(pgd,mm) \
-do { \
- BUG_ON(pgd == swapper_pg_dir); \
- cpu_do_switch_mm(virt_to_phys(pgd),mm); \
-} while (0)
-
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* __ASM_PROCFNS_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index d085595..4be934f 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -29,8 +29,10 @@
#include <linux/string.h>
+#include <asm/alternative.h>
#include <asm/fpsimd.h>
#include <asm/hw_breakpoint.h>
+#include <asm/lse.h>
#include <asm/pgtable-hwdef.h>
#include <asm/ptrace.h>
#include <asm/types.h>
@@ -177,9 +179,11 @@
}
#define ARCH_HAS_SPINLOCK_PREFETCH
-static inline void spin_lock_prefetch(const void *x)
+static inline void spin_lock_prefetch(const void *ptr)
{
- prefetchw(x);
+ asm volatile(ARM64_LSE_ATOMIC_INSN(
+ "prfm pstl1strm, %a0",
+ "nop") : : "p" (ptr));
}
#define HAVE_ARCH_PICK_MMAP_LAYOUT
@@ -187,5 +191,6 @@
#endif
int cpu_enable_pan(void *__unused);
+int cpu_enable_uao(void *__unused);
#endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/shmparam.h b/arch/arm64/include/asm/shmparam.h
index 4df608a..e368a55 100644
--- a/arch/arm64/include/asm/shmparam.h
+++ b/arch/arm64/include/asm/shmparam.h
@@ -21,7 +21,7 @@
* alignment value. Since we don't have aliasing D-caches, the rest of
* the time we can safely use PAGE_SIZE.
*/
-#define COMPAT_SHMLBA 0x4000
+#define COMPAT_SHMLBA (4 * PAGE_SIZE)
#include <asm-generic/shmparam.h>
diff --git a/arch/arm64/include/asm/signal32.h b/arch/arm64/include/asm/signal32.h
index eeaa975..81abea0 100644
--- a/arch/arm64/include/asm/signal32.h
+++ b/arch/arm64/include/asm/signal32.h
@@ -22,8 +22,6 @@
#define AARCH32_KERN_SIGRET_CODE_OFFSET 0x500
-extern const compat_ulong_t aarch32_sigret_code[6];
-
int compat_setup_frame(int usig, struct ksignal *ksig, sigset_t *set,
struct pt_regs *regs);
int compat_setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index d9c3d6a..4325b36 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -16,11 +16,22 @@
#ifndef __ASM_SMP_H
#define __ASM_SMP_H
+#include <asm/percpu.h>
+
#include <linux/threads.h>
#include <linux/cpumask.h>
#include <linux/thread_info.h>
-#define raw_smp_processor_id() (current_thread_info()->cpu)
+DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
+
+/*
+ * We don't use this_cpu_read(cpu_number) as that has implicit writes to
+ * preempt_count, and associated (compiler) barriers, that we'd like to avoid
+ * the expense of. If we're preemptible, the value can be stale at use anyway.
+ * And we can't use this_cpu_ptr() either, as that winds up recursing back
+ * here under CONFIG_DEBUG_PREEMPT=y.
+ */
+#define raw_smp_processor_id() (*raw_cpu_ptr(&cpu_number))
struct seq_file;
@@ -57,6 +68,9 @@
*/
struct secondary_data {
void *stack;
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ struct task_struct *task;
+#endif
};
extern struct secondary_data secondary_data;
extern void secondary_entry(void);
@@ -64,6 +78,15 @@
extern void arch_send_call_function_single_ipi(int cpu);
extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
+#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
+extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
+#else
+static inline void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
+{
+ BUILD_BUG();
+}
+#endif
+
extern int __cpu_disable(void);
extern void __cpu_die(unsigned int cpu);
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index fbbd7fb..fc037c3 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -26,9 +26,32 @@
* The memory barriers are implicit with the load-acquire and store-release
* instructions.
*/
+static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
+{
+ unsigned int tmp;
+ arch_spinlock_t lockval;
-#define arch_spin_unlock_wait(lock) \
- do { while (arch_spin_is_locked(lock)) cpu_relax(); } while (0)
+ asm volatile(
+" sevl\n"
+"1: wfe\n"
+"2: ldaxr %w0, %2\n"
+" eor %w1, %w0, %w0, ror #16\n"
+" cbnz %w1, 1b\n"
+ /* Serialise against any concurrent lockers */
+ ARM64_LSE_ATOMIC_INSN(
+ /* LL/SC */
+" stxr %w1, %w0, %2\n"
+" nop\n"
+" nop\n",
+ /* LSE atomics */
+" mov %w1, %w0\n"
+" cas %w0, %w0, %2\n"
+" eor %w1, %w1, %w0\n")
+" cbnz %w1, 2b\n"
+ : "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
+ :
+ : "memory");
+}
#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
diff --git a/arch/arm64/include/asm/stack_pointer.h b/arch/arm64/include/asm/stack_pointer.h
new file mode 100644
index 0000000..ffcdf74
--- /dev/null
+++ b/arch/arm64/include/asm/stack_pointer.h
@@ -0,0 +1,9 @@
+#ifndef __ASM_STACK_POINTER_H
+#define __ASM_STACK_POINTER_H
+
+/*
+ * how to get the current stack pointer from C
+ */
+register unsigned long current_stack_pointer asm ("sp");
+
+#endif /* __ASM_STACK_POINTER_H */
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 7318f6d..801a16db 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -16,14 +16,19 @@
#ifndef __ASM_STACKTRACE_H
#define __ASM_STACKTRACE_H
+struct task_struct;
+
struct stackframe {
unsigned long fp;
unsigned long sp;
unsigned long pc;
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ unsigned int graph;
+#endif
};
-extern int unwind_frame(struct stackframe *frame);
-extern void walk_stackframe(struct stackframe *frame,
+extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
+extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data);
#endif /* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 59a5b0f1..4d19a03 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,7 +1,7 @@
#ifndef __ASM_SUSPEND_H
#define __ASM_SUSPEND_H
-#define NR_CTX_REGS 11
+#define NR_CTX_REGS 13
/*
* struct cpu_suspend_ctx must be 16-byte aligned since it is allocated on
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index d48ab5b..1a78d6e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -20,6 +20,8 @@
#ifndef __ASM_SYSREG_H
#define __ASM_SYSREG_H
+#include <linux/stringify.h>
+
#include <asm/opcodes.h>
/*
@@ -70,15 +72,19 @@
#define SYS_ID_AA64MMFR0_EL1 sys_reg(3, 0, 0, 7, 0)
#define SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1)
+#define SYS_ID_AA64MMFR2_EL1 sys_reg(3, 0, 0, 7, 2)
#define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0)
#define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1)
#define SYS_DCZID_EL0 sys_reg(3, 3, 0, 0, 7)
#define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4)
+#define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3)
#define SET_PSTATE_PAN(x) __inst_arm(0xd5000000 | REG_PSTATE_PAN_IMM |\
(!!x)<<8 | 0x1f)
+#define SET_PSTATE_UAO(x) __inst_arm(0xd5000000 | REG_PSTATE_UAO_IMM |\
+ (!!x)<<8 | 0x1f)
/* SCTLR_EL1 */
#define SCTLR_EL1_CP15BEN (0x1 << 5)
@@ -135,6 +141,9 @@
#define ID_AA64MMFR1_VMIDBITS_SHIFT 4
#define ID_AA64MMFR1_HADBS_SHIFT 0
+/* id_aa64mmfr2 */
+#define ID_AA64MMFR2_UAO_SHIFT 4
+
/* id_aa64dfr0 */
#define ID_AA64DFR0_CTX_CMPS_SHIFT 28
#define ID_AA64DFR0_WRPS_SHIFT 20
@@ -194,32 +203,34 @@
#ifdef __ASSEMBLY__
.irp num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
- .equ __reg_num_x\num, \num
+ .equ .L__reg_num_x\num, \num
.endr
- .equ __reg_num_xzr, 31
+ .equ .L__reg_num_xzr, 31
.macro mrs_s, rt, sreg
- .inst 0xd5200000|(\sreg)|(__reg_num_\rt)
+ .inst 0xd5200000|(\sreg)|(.L__reg_num_\rt)
.endm
.macro msr_s, sreg, rt
- .inst 0xd5000000|(\sreg)|(__reg_num_\rt)
+ .inst 0xd5000000|(\sreg)|(.L__reg_num_\rt)
.endm
#else
+#include <linux/types.h>
+
asm(
" .irp num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
-" .equ __reg_num_x\\num, \\num\n"
+" .equ .L__reg_num_x\\num, \\num\n"
" .endr\n"
-" .equ __reg_num_xzr, 31\n"
+" .equ .L__reg_num_xzr, 31\n"
"\n"
" .macro mrs_s, rt, sreg\n"
-" .inst 0xd5200000|(\\sreg)|(__reg_num_\\rt)\n"
+" .inst 0xd5200000|(\\sreg)|(.L__reg_num_\\rt)\n"
" .endm\n"
"\n"
" .macro msr_s, sreg, rt\n"
-" .inst 0xd5000000|(\\sreg)|(__reg_num_\\rt)\n"
+" .inst 0xd5000000|(\\sreg)|(.L__reg_num_\\rt)\n"
" .endm\n"
);
@@ -232,6 +243,23 @@
val |= set;
asm volatile("msr sctlr_el1, %0" : : "r" (val));
}
+
+/*
+ * Unlike read_cpuid, calls to read_sysreg are never expected to be
+ * optimized away or replaced with synthetic values.
+ */
+#define read_sysreg(r) ({ \
+ u64 __val; \
+ asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+ __val; \
+})
+
+#define write_sysreg(v, r) do { \
+ u64 __val = (u64)v; \
+ asm volatile("msr " __stringify(r) ", %0" \
+ : : "r" (__val)); \
+} while (0)
+
#endif
#endif /* __ASM_SYSREG_H */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 90c7ff2..67dd228 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -36,22 +36,36 @@
struct task_struct;
+#include <asm/stack_pointer.h>
#include <asm/types.h>
typedef unsigned long mm_segment_t;
/*
* low level task data that entry.S needs immediate access to.
- * __switch_to() assumes cpu_context follows immediately after cpu_domain.
*/
struct thread_info {
unsigned long flags; /* low level flags */
mm_segment_t addr_limit; /* address limit */
+#ifndef CONFIG_THREAD_INFO_IN_TASK
struct task_struct *task; /* main task structure */
+#endif
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+ u64 ttbr0; /* saved TTBR0_EL1 */
+#endif
int preempt_count; /* 0 => preemptable, <0 => bug */
+#ifndef CONFIG_THREAD_INFO_IN_TASK
int cpu; /* cpu */
+#endif
};
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+#define INIT_THREAD_INFO(tsk) \
+{ \
+ .preempt_count = INIT_PREEMPT_COUNT, \
+ .addr_limit = KERNEL_DS, \
+}
+#else
#define INIT_THREAD_INFO(tsk) \
{ \
.task = &tsk, \
@@ -60,25 +74,28 @@
.addr_limit = KERNEL_DS, \
}
-#define init_thread_info (init_thread_union.thread_info)
-#define init_stack (init_thread_union.stack)
-
-/*
- * how to get the current stack pointer from C
- */
-register unsigned long current_stack_pointer asm ("sp");
-
/*
* how to get the thread information struct from C
*/
static inline struct thread_info *current_thread_info(void) __attribute_const__;
+/*
+ * struct thread_info can be accessed directly via sp_el0.
+ */
static inline struct thread_info *current_thread_info(void)
{
- return (struct thread_info *)
- (current_stack_pointer & ~(THREAD_SIZE - 1));
+ unsigned long sp_el0;
+
+ asm ("mrs %0, sp_el0" : "=r" (sp_el0));
+
+ return (struct thread_info *)sp_el0;
}
+#define init_thread_info (init_thread_union.thread_info)
+#endif
+
+#define init_stack (init_thread_union.stack)
+
#define thread_saved_pc(tsk) \
((unsigned long)(tsk->thread.cpu_context.pc))
#define thread_saved_sp(tsk) \
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index b460ae2..ad6bd8b 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -23,6 +23,30 @@
#include <linux/sched.h>
#include <asm/cputype.h>
+#include <asm/mmu.h>
+
+/*
+ * Raw TLBI operations.
+ *
+ * Where necessary, use the __tlbi() macro to avoid asm()
+ * boilerplate. Drivers and most kernel code should use the TLB
+ * management routines in preference to the macro below.
+ *
+ * The macro can be used as __tlbi(op) or __tlbi(op, arg), depending
+ * on whether a particular TLBI operation takes an argument or
+ * not. The macros handles invoking the asm with or without the
+ * register argument as appropriate.
+ */
+#define __TLBI_0(op, arg) asm ("tlbi " #op)
+#define __TLBI_1(op, arg) asm ("tlbi " #op ", %0" : : "r" (arg))
+#define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+
+#define __tlbi(op, ...) __TLBI_N(op, ##__VA_ARGS__, 1, 0)
+
+#define __tlbi_user(op, arg) do { \
+ if (arm64_kernel_unmapped_at_el0()) \
+ __tlbi(op, (arg) | USER_ASID_FLAG); \
+} while (0)
/*
* TLB Management
@@ -66,7 +90,7 @@
static inline void local_flush_tlb_all(void)
{
dsb(nshst);
- asm("tlbi vmalle1");
+ __tlbi(vmalle1);
dsb(nsh);
isb();
}
@@ -74,7 +98,7 @@
static inline void flush_tlb_all(void)
{
dsb(ishst);
- asm("tlbi vmalle1is");
+ __tlbi(vmalle1is);
dsb(ish);
isb();
}
@@ -84,7 +108,8 @@
unsigned long asid = ASID(mm) << 48;
dsb(ishst);
- asm("tlbi aside1is, %0" : : "r" (asid));
+ __tlbi(aside1is, asid);
+ __tlbi_user(aside1is, asid);
dsb(ish);
}
@@ -94,7 +119,8 @@
unsigned long addr = uaddr >> 12 | (ASID(vma->vm_mm) << 48);
dsb(ishst);
- asm("tlbi vale1is, %0" : : "r" (addr));
+ __tlbi(vale1is, addr);
+ __tlbi_user(vale1is, addr);
dsb(ish);
}
@@ -121,10 +147,13 @@
dsb(ishst);
for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) {
- if (last_level)
- asm("tlbi vale1is, %0" : : "r"(addr));
- else
- asm("tlbi vae1is, %0" : : "r"(addr));
+ if (last_level) {
+ __tlbi(vale1is, addr);
+ __tlbi_user(vale1is, addr);
+ } else {
+ __tlbi(vae1is, addr);
+ __tlbi_user(vae1is, addr);
+ }
}
dsb(ish);
}
@@ -149,7 +178,7 @@
dsb(ishst);
for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
- asm("tlbi vaae1is, %0" : : "r"(addr));
+ __tlbi(vaae1is, addr);
dsb(ish);
isb();
}
@@ -163,7 +192,8 @@
{
unsigned long addr = uaddr >> 12 | (ASID(mm) << 48);
- asm("tlbi vae1is, %0" : : "r" (addr));
+ __tlbi(vae1is, addr);
+ __tlbi_user(vae1is, addr);
dsb(ish);
}
diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index a3e9d6f..bbd362c 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -22,6 +22,15 @@
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
+struct sched_domain;
+#ifdef CONFIG_CPU_FREQ
+#define arch_scale_freq_capacity cpufreq_scale_freq_capacity
+extern unsigned long cpufreq_scale_freq_capacity(struct sched_domain *sd, int cpu);
+extern unsigned long cpufreq_scale_max_freq_capacity(int cpu);
+#endif
+#define arch_scale_cpu_capacity scale_cpu_capacity
+extern unsigned long scale_cpu_capacity(struct sched_domain *sd, int cpu);
+
#include <asm-generic/topology.h>
#endif /* _ASM_ARM_TOPOLOGY_H */
diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index 0cc2f29..8f05d3b 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -34,7 +34,6 @@
void register_undef_hook(struct undef_hook *hook);
void unregister_undef_hook(struct undef_hook *hook);
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static inline int __in_irqentry_text(unsigned long ptr)
{
extern char __irqentry_text_start[];
@@ -43,12 +42,6 @@
return ptr >= (unsigned long)&__irqentry_text_start &&
ptr < (unsigned long)&__irqentry_text_end;
}
-#else
-static inline int __in_irqentry_text(unsigned long ptr)
-{
- return 0;
-}
-#endif
static inline int in_exception_text(unsigned long ptr)
{
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 829fa6d..d39d8bd 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -18,6 +18,13 @@
#ifndef __ASM_UACCESS_H
#define __ASM_UACCESS_H
+#include <asm/alternative.h>
+#include <asm/kernel-pgtable.h>
+#include <asm/mmu.h>
+#include <asm/sysreg.h>
+
+#ifndef __ASSEMBLY__
+
/*
* User space memory access functions
*/
@@ -25,10 +32,8 @@
#include <linux/string.h>
#include <linux/thread_info.h>
-#include <asm/alternative.h>
#include <asm/cpufeature.h>
#include <asm/ptrace.h>
-#include <asm/sysreg.h>
#include <asm/errno.h>
#include <asm/memory.h>
#include <asm/compiler.h>
@@ -37,11 +42,11 @@
#define VERIFY_WRITE 1
/*
- * The exception table consists of pairs of addresses: the first is the
- * address of an instruction that is allowed to fault, and the second is
- * the address at which the program should continue. No registers are
- * modified, so it is entirely up to the continuation code to figure out
- * what to do.
+ * The exception table consists of pairs of relative offsets: the first
+ * is the relative offset to an instruction that is allowed to fault,
+ * and the second is the relative offset at which the program should
+ * continue. No registers are modified, so it is entirely up to the
+ * continuation code to figure out what to do.
*
* All the routines below use bits of fixup code that are out of line
* with the main instruction path. This means when everything is well,
@@ -51,9 +56,11 @@
struct exception_table_entry
{
- unsigned long insn, fixup;
+ int insn, fixup;
};
+#define ARCH_HAS_RELATIVE_EXTABLE
+
extern int fixup_exception(struct pt_regs *regs);
#define KERNEL_DS (-1UL)
@@ -65,6 +72,16 @@
static inline void set_fs(mm_segment_t fs)
{
current_thread_info()->addr_limit = fs;
+
+ /*
+ * Enable/disable UAO so that copy_to_user() etc can access
+ * kernel memory with the unprivileged instructions.
+ */
+ if (IS_ENABLED(CONFIG_ARM64_UAO) && fs == KERNEL_DS)
+ asm(ALTERNATIVE("nop", SET_PSTATE_UAO(1), ARM64_HAS_UAO));
+ else
+ asm(ALTERNATIVE("nop", SET_PSTATE_UAO(0), ARM64_HAS_UAO,
+ CONFIG_ARM64_UAO));
}
#define segment_eq(a, b) ((a) == (b))
@@ -114,6 +131,121 @@
#define access_ok(type, addr, size) __range_ok(addr, size)
#define user_addr_max get_fs
+#define _ASM_EXTABLE(from, to) \
+ " .pushsection __ex_table, \"a\"\n" \
+ " .align 3\n" \
+ " .long (" #from " - .), (" #to " - .)\n" \
+ " .popsection\n"
+
+/*
+ * User access enabling/disabling.
+ */
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+static inline void __uaccess_ttbr0_disable(void)
+{
+ unsigned long flags, ttbr;
+
+ local_irq_save(flags);
+ ttbr = read_sysreg(ttbr1_el1);
+ ttbr &= ~TTBR_ASID_MASK;
+ /* reserved_ttbr0 placed at the end of swapper_pg_dir */
+ write_sysreg(ttbr + SWAPPER_DIR_SIZE, ttbr0_el1);
+ isb();
+ /* Set reserved ASID */
+ write_sysreg(ttbr, ttbr1_el1);
+ isb();
+ local_irq_restore(flags);
+}
+
+static inline void __uaccess_ttbr0_enable(void)
+{
+ unsigned long flags, ttbr0, ttbr1;
+
+ /*
+ * Disable interrupts to avoid preemption between reading the 'ttbr0'
+ * variable and the MSR. A context switch could trigger an ASID
+ * roll-over and an update of 'ttbr0'.
+ */
+ local_irq_save(flags);
+ ttbr0 = READ_ONCE(current_thread_info()->ttbr0);
+
+ /* Restore active ASID */
+ ttbr1 = read_sysreg(ttbr1_el1);
+ ttbr1 &= ~TTBR_ASID_MASK; /* safety measure */
+ ttbr1 |= ttbr0 & TTBR_ASID_MASK;
+ write_sysreg(ttbr1, ttbr1_el1);
+ isb();
+
+ /* Restore user page table */
+ write_sysreg(ttbr0, ttbr0_el1);
+ isb();
+ local_irq_restore(flags);
+}
+
+static inline bool uaccess_ttbr0_disable(void)
+{
+ if (!system_uses_ttbr0_pan())
+ return false;
+ __uaccess_ttbr0_disable();
+ return true;
+}
+
+static inline bool uaccess_ttbr0_enable(void)
+{
+ if (!system_uses_ttbr0_pan())
+ return false;
+ __uaccess_ttbr0_enable();
+ return true;
+}
+#else
+static inline bool uaccess_ttbr0_disable(void)
+{
+ return false;
+}
+
+static inline bool uaccess_ttbr0_enable(void)
+{
+ return false;
+}
+#endif
+
+#define __uaccess_disable(alt) \
+do { \
+ if (!uaccess_ttbr0_disable()) \
+ asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), alt, \
+ CONFIG_ARM64_PAN)); \
+} while (0)
+
+#define __uaccess_enable(alt) \
+do { \
+ if (!uaccess_ttbr0_enable()) \
+ asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), alt, \
+ CONFIG_ARM64_PAN)); \
+} while (0)
+
+static inline void uaccess_disable(void)
+{
+ __uaccess_disable(ARM64_HAS_PAN);
+}
+
+static inline void uaccess_enable(void)
+{
+ __uaccess_enable(ARM64_HAS_PAN);
+}
+
+/*
+ * These functions are no-ops when UAO is present.
+ */
+static inline void uaccess_disable_not_uao(void)
+{
+ __uaccess_disable(ARM64_ALT_PAN_NOT_UAO);
+}
+
+static inline void uaccess_enable_not_uao(void)
+{
+ __uaccess_enable(ARM64_ALT_PAN_NOT_UAO);
+}
+
/*
* The "__xxx" versions of the user access functions do not verify the address
* space - it must have been done previously with a separate "access_ok()"
@@ -122,9 +254,10 @@
* The "__xxx_error" versions set the third argument to -EFAULT if an error
* occurs, and leave it unchanged on success.
*/
-#define __get_user_asm(instr, reg, x, addr, err) \
+#define __get_user_asm(instr, alt_instr, reg, x, addr, err, feature) \
asm volatile( \
- "1: " instr " " reg "1, [%2]\n" \
+ "1:"ALTERNATIVE(instr " " reg "1, [%2]\n", \
+ alt_instr " " reg "1, [%2]\n", feature) \
"2:\n" \
" .section .fixup, \"ax\"\n" \
" .align 2\n" \
@@ -132,10 +265,7 @@
" mov %1, #0\n" \
" b 2b\n" \
" .previous\n" \
- " .section __ex_table,\"a\"\n" \
- " .align 3\n" \
- " .quad 1b, 3b\n" \
- " .previous" \
+ _ASM_EXTABLE(1b, 3b) \
: "+r" (err), "=&r" (x) \
: "r" (addr), "i" (-EFAULT))
@@ -143,27 +273,29 @@
do { \
unsigned long __gu_val; \
__chk_user_ptr(ptr); \
- asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN)); \
+ uaccess_enable_not_uao(); \
switch (sizeof(*(ptr))) { \
case 1: \
- __get_user_asm("ldrb", "%w", __gu_val, (ptr), (err)); \
+ __get_user_asm("ldrb", "ldtrb", "%w", __gu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
case 2: \
- __get_user_asm("ldrh", "%w", __gu_val, (ptr), (err)); \
+ __get_user_asm("ldrh", "ldtrh", "%w", __gu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
case 4: \
- __get_user_asm("ldr", "%w", __gu_val, (ptr), (err)); \
+ __get_user_asm("ldr", "ldtr", "%w", __gu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
case 8: \
- __get_user_asm("ldr", "%", __gu_val, (ptr), (err)); \
+ __get_user_asm("ldr", "ldtr", "%", __gu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
default: \
BUILD_BUG(); \
} \
+ uaccess_disable_not_uao(); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
- asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN)); \
} while (0)
#define __get_user(x, ptr) \
@@ -190,19 +322,17 @@
((x) = 0, -EFAULT); \
})
-#define __put_user_asm(instr, reg, x, addr, err) \
+#define __put_user_asm(instr, alt_instr, reg, x, addr, err, feature) \
asm volatile( \
- "1: " instr " " reg "1, [%2]\n" \
+ "1:"ALTERNATIVE(instr " " reg "1, [%2]\n", \
+ alt_instr " " reg "1, [%2]\n", feature) \
"2:\n" \
" .section .fixup,\"ax\"\n" \
" .align 2\n" \
"3: mov %w0, %3\n" \
" b 2b\n" \
" .previous\n" \
- " .section __ex_table,\"a\"\n" \
- " .align 3\n" \
- " .quad 1b, 3b\n" \
- " .previous" \
+ _ASM_EXTABLE(1b, 3b) \
: "+r" (err) \
: "r" (x), "r" (addr), "i" (-EFAULT))
@@ -210,26 +340,28 @@
do { \
__typeof__(*(ptr)) __pu_val = (x); \
__chk_user_ptr(ptr); \
- asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN)); \
+ uaccess_enable_not_uao(); \
switch (sizeof(*(ptr))) { \
case 1: \
- __put_user_asm("strb", "%w", __pu_val, (ptr), (err)); \
+ __put_user_asm("strb", "sttrb", "%w", __pu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
case 2: \
- __put_user_asm("strh", "%w", __pu_val, (ptr), (err)); \
+ __put_user_asm("strh", "sttrh", "%w", __pu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
case 4: \
- __put_user_asm("str", "%w", __pu_val, (ptr), (err)); \
+ __put_user_asm("str", "sttr", "%w", __pu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
case 8: \
- __put_user_asm("str", "%", __pu_val, (ptr), (err)); \
+ __put_user_asm("str", "sttr", "%", __pu_val, (ptr), \
+ (err), ARM64_HAS_UAO); \
break; \
default: \
BUILD_BUG(); \
} \
- asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN)); \
+ uaccess_disable_not_uao(); \
} while (0)
#define __put_user(x, ptr) \
@@ -256,24 +388,39 @@
-EFAULT; \
})
-extern unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n);
-extern unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n);
+extern unsigned long __must_check __arch_copy_from_user(void *to, const void __user *from, unsigned long n);
+extern unsigned long __must_check __arch_copy_to_user(void __user *to, const void *from, unsigned long n);
extern unsigned long __must_check __copy_in_user(void __user *to, const void __user *from, unsigned long n);
extern unsigned long __must_check __clear_user(void __user *addr, unsigned long n);
+static inline unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+ check_object_size(to, n, false);
+ return __arch_copy_from_user(to, from, n);
+}
+
+static inline unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+ check_object_size(from, n, true);
+ return __arch_copy_to_user(to, from, n);
+}
+
static inline unsigned long __must_check copy_from_user(void *to, const void __user *from, unsigned long n)
{
- if (access_ok(VERIFY_READ, from, n))
- n = __copy_from_user(to, from, n);
- else /* security hole - plug it */
+ if (access_ok(VERIFY_READ, from, n)) {
+ check_object_size(to, n, false);
+ n = __arch_copy_from_user(to, from, n);
+ } else /* security hole - plug it */
memset(to, 0, n);
return n;
}
static inline unsigned long __must_check copy_to_user(void __user *to, const void *from, unsigned long n)
{
- if (access_ok(VERIFY_WRITE, to, n))
- n = __copy_to_user(to, from, n);
+ if (access_ok(VERIFY_WRITE, to, n)) {
+ check_object_size(from, n, true);
+ n = __arch_copy_to_user(to, from, n);
+ }
return n;
}
@@ -299,4 +446,77 @@
extern __must_check long strlen_user(const char __user *str);
extern __must_check long strnlen_user(const char __user *str, long n);
+#else /* __ASSEMBLY__ */
+
+#include <asm/assembler.h>
+
+/*
+ * User access enabling/disabling macros.
+ */
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+ .macro __uaccess_ttbr0_disable, tmp1
+ mrs \tmp1, ttbr1_el1 // swapper_pg_dir
+ bic \tmp1, \tmp1, #TTBR_ASID_MASK
+ add \tmp1, \tmp1, #SWAPPER_DIR_SIZE // reserved_ttbr0 at the end of swapper_pg_dir
+ msr ttbr0_el1, \tmp1 // set reserved TTBR0_EL1
+ isb
+ sub \tmp1, \tmp1, #SWAPPER_DIR_SIZE
+ msr ttbr1_el1, \tmp1 // set reserved ASID
+ isb
+ .endm
+
+ .macro __uaccess_ttbr0_enable, tmp1, tmp2
+ get_thread_info \tmp1
+ ldr \tmp1, [\tmp1, #TSK_TI_TTBR0] // load saved TTBR0_EL1
+ mrs \tmp2, ttbr1_el1
+ extr \tmp2, \tmp2, \tmp1, #48
+ ror \tmp2, \tmp2, #16
+ msr ttbr1_el1, \tmp2 // set the active ASID
+ isb
+ msr ttbr0_el1, \tmp1 // set the non-PAN TTBR0_EL1
+ isb
+ .endm
+
+ .macro uaccess_ttbr0_disable, tmp1, tmp2
+alternative_if_not ARM64_HAS_PAN
+ save_and_disable_irq \tmp2 // avoid preemption
+ __uaccess_ttbr0_disable \tmp1
+ restore_irq \tmp2
+alternative_else_nop_endif
+ .endm
+
+ .macro uaccess_ttbr0_enable, tmp1, tmp2, tmp3
+alternative_if_not ARM64_HAS_PAN
+ save_and_disable_irq \tmp3 // avoid preemption
+ __uaccess_ttbr0_enable \tmp1, \tmp2
+ restore_irq \tmp3
+alternative_else_nop_endif
+ .endm
+#else
+ .macro uaccess_ttbr0_disable, tmp1, tmp2
+ .endm
+
+ .macro uaccess_ttbr0_enable, tmp1, tmp2, tmp3
+ .endm
+#endif
+
+/*
+ * These macros are no-ops when UAO is present.
+ */
+ .macro uaccess_disable_not_uao, tmp1, tmp2
+ uaccess_ttbr0_disable \tmp1, \tmp2
+alternative_if ARM64_ALT_PAN_NOT_UAO
+ SET_PSTATE_PAN(1)
+alternative_else_nop_endif
+ .endm
+
+ .macro uaccess_enable_not_uao, tmp1, tmp2, tmp3
+ uaccess_ttbr0_enable \tmp1, \tmp2, \tmp3
+alternative_if ARM64_ALT_PAN_NOT_UAO
+ SET_PSTATE_PAN(0)
+alternative_else_nop_endif
+ .endm
+
+#endif /* __ASSEMBLY__ */
+
#endif /* __ASM_UACCESS_H */
diff --git a/arch/arm64/include/asm/vdso_datapage.h b/arch/arm64/include/asm/vdso_datapage.h
index de66199..2b9a637 100644
--- a/arch/arm64/include/asm/vdso_datapage.h
+++ b/arch/arm64/include/asm/vdso_datapage.h
@@ -22,6 +22,8 @@
struct vdso_data {
__u64 cs_cycle_last; /* Timebase at clocksource init */
+ __u64 raw_time_sec; /* Raw time */
+ __u64 raw_time_nsec;
__u64 xtime_clock_sec; /* Kernel time */
__u64 xtime_clock_nsec;
__u64 xtime_coarse_sec; /* Coarse time */
@@ -29,8 +31,10 @@
__u64 wtm_clock_sec; /* Wall to monotonic time */
__u64 wtm_clock_nsec;
__u32 tb_seq_count; /* Timebase sequence counter */
- __u32 cs_mult; /* Clocksource multiplier */
- __u32 cs_shift; /* Clocksource shift */
+ /* cs_* members must be adjacent and in this order (ldp accesses) */
+ __u32 cs_mono_mult; /* NTP-adjusted clocksource multiplier */
+ __u32 cs_shift; /* Clocksource shift (mono = raw) */
+ __u32 cs_raw_mult; /* Raw clocksource multiplier */
__u32 tz_minuteswest; /* Whacky timezone stuff */
__u32 tz_dsttime;
__u32 use_syscall;
diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h
index aab5bf0..2b79b8a 100644
--- a/arch/arm64/include/asm/word-at-a-time.h
+++ b/arch/arm64/include/asm/word-at-a-time.h
@@ -16,6 +16,8 @@
#ifndef __ASM_WORD_AT_A_TIME_H
#define __ASM_WORD_AT_A_TIME_H
+#include <asm/uaccess.h>
+
#ifndef __AARCH64EB__
#include <linux/kernel.h>
@@ -81,10 +83,7 @@
#endif
" b 2b\n"
" .popsection\n"
- " .pushsection __ex_table,\"a\"\n"
- " .align 3\n"
- " .quad 1b, 3b\n"
- " .popsection"
+ _ASM_EXTABLE(1b, 3b)
: "=&r" (ret), "=&r" (offset)
: "r" (addr), "Q" (*(unsigned long *)addr));
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 3378238..d1ff83d 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -45,6 +45,7 @@
#define PSR_A_BIT 0x00000100
#define PSR_D_BIT 0x00000200
#define PSR_PAN_BIT 0x00400000
+#define PSR_UAO_BIT 0x00800000
#define PSR_Q_BIT 0x08000000
#define PSR_V_BIT 0x10000000
#define PSR_C_BIT 0x20000000
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f..d100343 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -14,10 +14,10 @@
arm64-obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
entry-fpsimd.o process.o ptrace.o setup.o signal.o \
sys.o stacktrace.o time.o traps.o io.o vdso.o \
- hyp-stub.o psci.o psci-call.o cpu_ops.o insn.o \
+ hyp-stub.o psci.o cpu_ops.o insn.o \
return_address.o cpuinfo.o cpu_errata.o \
cpufeature.o alternative.o cacheinfo.o \
- smp.o smp_spin_table.o topology.o
+ smp.o smp_spin_table.o topology.o smccc-call.o
extra-$(CONFIG_EFI) := efi-entry.o
@@ -30,6 +30,7 @@
../../arm/kernel/opcodes.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
+arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
arm64-obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
@@ -41,12 +42,10 @@
arm64-obj-$(CONFIG_PCI) += pci.o
arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
arm64-obj-$(CONFIG_ACPI) += acpi.o
+arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o
+arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-y += $(arm64-obj-y) vdso/
obj-m += $(arm64-obj-m)
head-y := head.o
extra-y += $(head-y) vmlinux.lds
-
-# vDSO - this must be built first to generate the symbol offsets
-$(call objectify,$(arm64-obj-y)): $(obj)/vdso/vdso-offsets.h
-$(obj)/vdso/vdso-offsets.h: $(obj)/vdso
diff --git a/arch/arm64/kernel/acpi_parking_protocol.c b/arch/arm64/kernel/acpi_parking_protocol.c
new file mode 100644
index 0000000..89c96bd
--- /dev/null
+++ b/arch/arm64/kernel/acpi_parking_protocol.c
@@ -0,0 +1,154 @@
+/*
+ * ARM64 ACPI Parking Protocol implementation
+ *
+ * Authors: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+ * Mark Salter <msalter@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/acpi.h>
+#include <linux/mm.h>
+#include <linux/types.h>
+
+#include <asm/cpu_ops.h>
+
+struct cpu_mailbox_entry {
+ phys_addr_t mailbox_addr;
+ u8 version;
+ u8 gic_cpu_id;
+};
+
+static struct cpu_mailbox_entry cpu_mailbox_entries[NR_CPUS];
+
+void __init acpi_set_mailbox_entry(int cpu,
+ struct acpi_madt_generic_interrupt *p)
+{
+ struct cpu_mailbox_entry *cpu_entry = &cpu_mailbox_entries[cpu];
+
+ cpu_entry->mailbox_addr = p->parked_address;
+ cpu_entry->version = p->parking_version;
+ cpu_entry->gic_cpu_id = p->cpu_interface_number;
+}
+
+bool acpi_parking_protocol_valid(int cpu)
+{
+ struct cpu_mailbox_entry *cpu_entry = &cpu_mailbox_entries[cpu];
+
+ return cpu_entry->mailbox_addr && cpu_entry->version;
+}
+
+static int acpi_parking_protocol_cpu_init(unsigned int cpu)
+{
+ pr_debug("%s: ACPI parked addr=%llx\n", __func__,
+ cpu_mailbox_entries[cpu].mailbox_addr);
+
+ return 0;
+}
+
+static int acpi_parking_protocol_cpu_prepare(unsigned int cpu)
+{
+ return 0;
+}
+
+struct parking_protocol_mailbox {
+ __le32 cpu_id;
+ __le32 reserved;
+ __le64 entry_point;
+};
+
+static int acpi_parking_protocol_cpu_boot(unsigned int cpu)
+{
+ struct cpu_mailbox_entry *cpu_entry = &cpu_mailbox_entries[cpu];
+ struct parking_protocol_mailbox __iomem *mailbox;
+ __le32 cpu_id;
+
+ /*
+ * Map mailbox memory with attribute device nGnRE (ie ioremap -
+ * this deviates from the parking protocol specifications since
+ * the mailboxes are required to be mapped nGnRnE; the attribute
+ * discrepancy is harmless insofar as the protocol specification
+ * is concerned).
+ * If the mailbox is mistakenly allocated in the linear mapping
+ * by FW ioremap will fail since the mapping will be prevented
+ * by the kernel (it clashes with the linear mapping attributes
+ * specifications).
+ */
+ mailbox = ioremap(cpu_entry->mailbox_addr, sizeof(*mailbox));
+ if (!mailbox)
+ return -EIO;
+
+ cpu_id = readl_relaxed(&mailbox->cpu_id);
+ /*
+ * Check if firmware has set-up the mailbox entry properly
+ * before kickstarting the respective cpu.
+ */
+ if (cpu_id != ~0U) {
+ iounmap(mailbox);
+ return -ENXIO;
+ }
+
+ /*
+ * We write the entry point and cpu id as LE regardless of the
+ * native endianness of the kernel. Therefore, any boot-loaders
+ * that read this address need to convert this address to the
+ * Boot-Loader's endianness before jumping.
+ */
+ writeq_relaxed(__pa_symbol(secondary_entry), &mailbox->entry_point);
+ writel_relaxed(cpu_entry->gic_cpu_id, &mailbox->cpu_id);
+
+ arch_send_wakeup_ipi_mask(cpumask_of(cpu));
+
+ iounmap(mailbox);
+
+ return 0;
+}
+
+static void acpi_parking_protocol_cpu_postboot(void)
+{
+ int cpu = smp_processor_id();
+ struct cpu_mailbox_entry *cpu_entry = &cpu_mailbox_entries[cpu];
+ struct parking_protocol_mailbox __iomem *mailbox;
+ __le64 entry_point;
+
+ /*
+ * Map mailbox memory with attribute device nGnRE (ie ioremap -
+ * this deviates from the parking protocol specifications since
+ * the mailboxes are required to be mapped nGnRnE; the attribute
+ * discrepancy is harmless insofar as the protocol specification
+ * is concerned).
+ * If the mailbox is mistakenly allocated in the linear mapping
+ * by FW ioremap will fail since the mapping will be prevented
+ * by the kernel (it clashes with the linear mapping attributes
+ * specifications).
+ */
+ mailbox = ioremap(cpu_entry->mailbox_addr, sizeof(*mailbox));
+ if (!mailbox)
+ return;
+
+ entry_point = readl_relaxed(&mailbox->entry_point);
+ /*
+ * Check if firmware has cleared the entry_point as expected
+ * by the protocol specification.
+ */
+ WARN_ON(entry_point);
+
+ iounmap(mailbox);
+}
+
+const struct cpu_operations acpi_parking_protocol_ops = {
+ .name = "parking-protocol",
+ .cpu_init = acpi_parking_protocol_cpu_init,
+ .cpu_prepare = acpi_parking_protocol_cpu_prepare,
+ .cpu_boot = acpi_parking_protocol_cpu_boot,
+ .cpu_postboot = acpi_parking_protocol_cpu_postboot
+};
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index ab9db0e..d2ee1b2 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -158,9 +158,3 @@
__apply_alternatives(®ion);
}
-
-void free_alternatives_memory(void)
-{
- free_reserved_area(__alt_instructions, __alt_instructions_end,
- 0, "alternatives");
-}
diff --git a/arch/arm64/kernel/arm64ksyms.c b/arch/arm64/kernel/arm64ksyms.c
index 3b6d8cc..2dc4440 100644
--- a/arch/arm64/kernel/arm64ksyms.c
+++ b/arch/arm64/kernel/arm64ksyms.c
@@ -26,6 +26,7 @@
#include <linux/syscalls.h>
#include <linux/uaccess.h>
#include <linux/io.h>
+#include <linux/arm-smccc.h>
#include <asm/checksum.h>
@@ -33,8 +34,8 @@
EXPORT_SYMBOL(clear_page);
/* user mem (segment) */
-EXPORT_SYMBOL(__copy_from_user);
-EXPORT_SYMBOL(__copy_to_user);
+EXPORT_SYMBOL(__arch_copy_from_user);
+EXPORT_SYMBOL(__arch_copy_to_user);
EXPORT_SYMBOL(__clear_user);
EXPORT_SYMBOL(__copy_in_user);
@@ -68,3 +69,7 @@
#ifdef CONFIG_FUNCTION_TRACER
EXPORT_SYMBOL(_mcount);
#endif
+
+ /* arm-smccc */
+EXPORT_SYMBOL(arm_smccc_smc);
+EXPORT_SYMBOL(arm_smccc_hvc);
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 478a00b..185ac13 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -14,7 +14,6 @@
#include <linux/slab.h>
#include <linux/sysctl.h>
-#include <asm/alternative.h>
#include <asm/cpufeature.h>
#include <asm/insn.h>
#include <asm/opcodes.h>
@@ -62,7 +61,7 @@
};
static LIST_HEAD(insn_emulation);
-static int nr_insn_emulated;
+static int nr_insn_emulated __initdata;
static DEFINE_RAW_SPINLOCK(insn_emulation_lock);
static void register_emulation_hooks(struct insn_emulation_ops *ops)
@@ -173,7 +172,7 @@
return ret;
}
-static void register_insn_emulation(struct insn_emulation_ops *ops)
+static void __init register_insn_emulation(struct insn_emulation_ops *ops)
{
unsigned long flags;
struct insn_emulation *insn;
@@ -237,7 +236,7 @@
{ }
};
-static void register_insn_emulation_sysctl(struct ctl_table *table)
+static void __init register_insn_emulation_sysctl(struct ctl_table *table)
{
unsigned long flags;
int i = 0;
@@ -281,9 +280,9 @@
* Error-checking SWP macros implemented using ldxr{b}/stxr{b}
*/
#define __user_swpX_asm(data, addr, res, temp, B) \
+do { \
+ uaccess_enable(); \
__asm__ __volatile__( \
- ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN) \
"0: ldxr"B" %w2, [%3]\n" \
"1: stxr"B" %w0, %w1, [%3]\n" \
" cbz %w0, 2f\n" \
@@ -297,17 +296,14 @@
"4: mov %w0, %w5\n" \
" b 3b\n" \
" .popsection" \
- " .pushsection __ex_table,\"a\"\n" \
- " .align 3\n" \
- " .quad 0b, 4b\n" \
- " .quad 1b, 4b\n" \
- " .popsection\n" \
- ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
- CONFIG_ARM64_PAN) \
+ _ASM_EXTABLE(0b, 4b) \
+ _ASM_EXTABLE(1b, 4b) \
: "=&r" (res), "+r" (data), "=&r" (temp) \
: "r" ((unsigned long)addr), "i" (-EAGAIN), \
"i" (-EFAULT) \
- : "memory")
+ : "memory"); \
+ uaccess_disable(); \
+} while (0)
#define __user_swp_asm(data, addr, res, temp) \
__user_swpX_asm(data, addr, res, temp, "")
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 087cf9a..3fa949f 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -22,22 +22,32 @@
#include <linux/mm.h>
#include <linux/dma-mapping.h>
#include <linux/kvm_host.h>
+#include <asm/fixmap.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/smp_plat.h>
#include <asm/suspend.h>
#include <asm/vdso_datapage.h>
#include <linux/kbuild.h>
+#include <linux/arm-smccc.h>
int main(void)
{
DEFINE(TSK_ACTIVE_MM, offsetof(struct task_struct, active_mm));
BLANK();
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ DEFINE(TSK_TI_FLAGS, offsetof(struct task_struct, thread_info.flags));
+ DEFINE(TSK_TI_PREEMPT, offsetof(struct task_struct, thread_info.preempt_count));
+ DEFINE(TSK_TI_ADDR_LIMIT, offsetof(struct task_struct, thread_info.addr_limit));
+ DEFINE(TSK_STACK, offsetof(struct task_struct, stack));
+#else
DEFINE(TI_FLAGS, offsetof(struct thread_info, flags));
DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count));
DEFINE(TI_ADDR_LIMIT, offsetof(struct thread_info, addr_limit));
- DEFINE(TI_TASK, offsetof(struct thread_info, task));
- DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
+#endif
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+ DEFINE(TSK_TI_TTBR0, offsetof(struct thread_info, ttbr0));
+#endif
BLANK();
DEFINE(THREAD_CPU_CONTEXT, offsetof(struct task_struct, thread.cpu_context));
BLANK();
@@ -76,6 +86,7 @@
BLANK();
DEFINE(CLOCK_REALTIME, CLOCK_REALTIME);
DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC);
+ DEFINE(CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC_RAW);
DEFINE(CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC);
DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
DEFINE(CLOCK_MONOTONIC_COARSE,CLOCK_MONOTONIC_COARSE);
@@ -83,6 +94,8 @@
DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
BLANK();
DEFINE(VDSO_CS_CYCLE_LAST, offsetof(struct vdso_data, cs_cycle_last));
+ DEFINE(VDSO_RAW_TIME_SEC, offsetof(struct vdso_data, raw_time_sec));
+ DEFINE(VDSO_RAW_TIME_NSEC, offsetof(struct vdso_data, raw_time_nsec));
DEFINE(VDSO_XTIME_CLK_SEC, offsetof(struct vdso_data, xtime_clock_sec));
DEFINE(VDSO_XTIME_CLK_NSEC, offsetof(struct vdso_data, xtime_clock_nsec));
DEFINE(VDSO_XTIME_CRS_SEC, offsetof(struct vdso_data, xtime_coarse_sec));
@@ -90,7 +103,8 @@
DEFINE(VDSO_WTM_CLK_SEC, offsetof(struct vdso_data, wtm_clock_sec));
DEFINE(VDSO_WTM_CLK_NSEC, offsetof(struct vdso_data, wtm_clock_nsec));
DEFINE(VDSO_TB_SEQ_COUNT, offsetof(struct vdso_data, tb_seq_count));
- DEFINE(VDSO_CS_MULT, offsetof(struct vdso_data, cs_mult));
+ DEFINE(VDSO_CS_MONO_MULT, offsetof(struct vdso_data, cs_mono_mult));
+ DEFINE(VDSO_CS_RAW_MULT, offsetof(struct vdso_data, cs_raw_mult));
DEFINE(VDSO_CS_SHIFT, offsetof(struct vdso_data, cs_shift));
DEFINE(VDSO_TZ_MINWEST, offsetof(struct vdso_data, tz_minuteswest));
DEFINE(VDSO_TZ_DSTTIME, offsetof(struct vdso_data, tz_dsttime));
@@ -104,6 +118,11 @@
DEFINE(TZ_MINWEST, offsetof(struct timezone, tz_minuteswest));
DEFINE(TZ_DSTTIME, offsetof(struct timezone, tz_dsttime));
BLANK();
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ DEFINE(CPU_BOOT_STACK, offsetof(struct secondary_data, stack));
+ DEFINE(CPU_BOOT_TASK, offsetof(struct secondary_data, task));
+ BLANK();
+#endif
#ifdef CONFIG_KVM_ARM_HOST
DEFINE(VCPU_CONTEXT, offsetof(struct kvm_vcpu, arch.ctxt));
DEFINE(CPU_GP_REGS, offsetof(struct kvm_cpu_context, gp_regs));
@@ -162,5 +181,11 @@
DEFINE(SLEEP_SAVE_SP_PHYS, offsetof(struct sleep_save_sp, save_ptr_stash_phys));
DEFINE(SLEEP_SAVE_SP_VIRT, offsetof(struct sleep_save_sp, save_ptr_stash));
#endif
+ DEFINE(ARM_SMCCC_RES_X0_OFFS, offsetof(struct arm_smccc_res, a0));
+ DEFINE(ARM_SMCCC_RES_X2_OFFS, offsetof(struct arm_smccc_res, a2));
+ BLANK();
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ DEFINE(TRAMP_VALIAS, TRAMP_VALIAS);
+#endif
return 0;
}
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index a3e846a..06afd04 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -21,24 +21,12 @@
#include <asm/cputype.h>
#include <asm/cpufeature.h>
-#define MIDR_CORTEX_A53 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
-#define MIDR_CORTEX_A57 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
-#define MIDR_THUNDERX MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
-
-#define CPU_MODEL_MASK (MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | \
- MIDR_ARCHITECTURE_MASK)
-
static bool __maybe_unused
is_affected_midr_range(const struct arm64_cpu_capabilities *entry)
{
- u32 midr = read_cpuid_id();
-
- if ((midr & CPU_MODEL_MASK) != entry->midr_model)
- return false;
-
- midr &= MIDR_REVISION_MASK | MIDR_VARIANT_MASK;
-
- return (midr >= entry->midr_range_min && midr <= entry->midr_range_max);
+ return MIDR_IS_CPU_MODEL_RANGE(read_cpuid_id(), entry->midr_model,
+ entry->midr_range_min,
+ entry->midr_range_max);
}
#define MIDR_RANGE(model, min, max) \
diff --git a/arch/arm64/kernel/cpu_ops.c b/arch/arm64/kernel/cpu_ops.c
index b6bd7d4..c7cfb8f 100644
--- a/arch/arm64/kernel/cpu_ops.c
+++ b/arch/arm64/kernel/cpu_ops.c
@@ -25,19 +25,30 @@
#include <asm/smp_plat.h>
extern const struct cpu_operations smp_spin_table_ops;
+extern const struct cpu_operations acpi_parking_protocol_ops;
extern const struct cpu_operations cpu_psci_ops;
const struct cpu_operations *cpu_ops[NR_CPUS];
-static const struct cpu_operations *supported_cpu_ops[] __initconst = {
+static const struct cpu_operations *dt_supported_cpu_ops[] __initconst = {
&smp_spin_table_ops,
&cpu_psci_ops,
NULL,
};
+static const struct cpu_operations *acpi_supported_cpu_ops[] __initconst = {
+#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
+ &acpi_parking_protocol_ops,
+#endif
+ &cpu_psci_ops,
+ NULL,
+};
+
static const struct cpu_operations * __init cpu_get_ops(const char *name)
{
- const struct cpu_operations **ops = supported_cpu_ops;
+ const struct cpu_operations **ops;
+
+ ops = acpi_disabled ? dt_supported_cpu_ops : acpi_supported_cpu_ops;
while (*ops) {
if (!strcmp(name, (*ops)->name))
@@ -75,8 +86,16 @@
}
} else {
enable_method = acpi_get_enable_method(cpu);
- if (!enable_method)
- pr_err("Unsupported ACPI enable-method\n");
+ if (!enable_method) {
+ /*
+ * In ACPI systems the boot CPU does not require
+ * checking the enable method since for some
+ * boot protocol (ie parking protocol) it need not
+ * be initialized. Don't warn spuriously.
+ */
+ if (cpu != 0)
+ pr_err("Unsupported ACPI enable-method\n");
+ }
}
return enable_method;
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 2735bf8..0e34add 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -23,6 +23,7 @@
#include <linux/sort.h>
#include <linux/stop_machine.h>
#include <linux/types.h>
+#include <linux/mm.h>
#include <asm/cpu.h>
#include <asm/cpufeature.h>
#include <asm/cpu_ops.h>
@@ -45,6 +46,7 @@
#endif
DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
+EXPORT_SYMBOL(cpu_hwcaps);
#define __ARM64_FTR_BITS(SIGNED, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
{ \
@@ -69,6 +71,10 @@
.width = 0, \
}
+/* meta feature for alternatives */
+static bool __maybe_unused
+cpufeature_pan_not_uao(const struct arm64_cpu_capabilities *entry);
+
static struct arm64_ftr_bits ftr_id_aa64isar0[] = {
ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 32, 32, 0),
ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64ISAR0_RDM_SHIFT, 4, 0),
@@ -125,6 +131,11 @@
ARM64_FTR_END,
};
+static struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
+ ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64MMFR2_UAO_SHIFT, 4, 0),
+ ARM64_FTR_END,
+};
+
static struct arm64_ftr_bits ftr_ctr[] = {
U_ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RAO */
ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 28, 3, 0),
@@ -286,6 +297,7 @@
/* Op1 = 0, CRn = 0, CRm = 7 */
ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0),
ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1),
+ ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2),
/* Op1 = 3, CRn = 0, CRm = 0 */
ARM64_FTR_REG(SYS_CTR_EL0, ftr_ctr),
@@ -410,6 +422,7 @@
init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1);
init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0);
init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1);
+ init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
@@ -519,6 +532,8 @@
info->reg_id_aa64mmfr0, boot->reg_id_aa64mmfr0);
taint |= check_update_ftr_reg(SYS_ID_AA64MMFR1_EL1, cpu,
info->reg_id_aa64mmfr1, boot->reg_id_aa64mmfr1);
+ taint |= check_update_ftr_reg(SYS_ID_AA64MMFR2_EL1, cpu,
+ info->reg_id_aa64mmfr2, boot->reg_id_aa64mmfr2);
/*
* EL3 is not our concern.
@@ -623,6 +638,51 @@
return has_sre;
}
+static bool has_no_hw_prefetch(const struct arm64_cpu_capabilities *entry)
+{
+ u32 midr = read_cpuid_id();
+ u32 rv_min, rv_max;
+
+ /* Cavium ThunderX pass 1.x and 2.x */
+ rv_min = 0;
+ rv_max = (1 << MIDR_VARIANT_SHIFT) | MIDR_REVISION_MASK;
+
+ return MIDR_IS_CPU_MODEL_RANGE(midr, MIDR_THUNDERX, rv_min, rv_max);
+}
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
+
+static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry)
+{
+ /* Forced on command line? */
+ if (__kpti_forced) {
+ pr_info_once("kernel page table isolation forced %s by command line option\n",
+ __kpti_forced > 0 ? "ON" : "OFF");
+ return __kpti_forced > 0;
+ }
+
+ /* Useful for KASLR robustness */
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
+ return true;
+
+ return false;
+}
+
+static int __init parse_kpti(char *str)
+{
+ bool enabled;
+ int ret = strtobool(str, &enabled);
+
+ if (ret)
+ return ret;
+
+ __kpti_forced = enabled ? 1 : -1;
+ return 0;
+}
+__setup("kpti=", parse_kpti);
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "GIC system register CPU interface",
@@ -653,6 +713,34 @@
.min_field_value = 2,
},
#endif /* CONFIG_AS_LSE && CONFIG_ARM64_LSE_ATOMICS */
+ {
+ .desc = "Software prefetching using PRFM",
+ .capability = ARM64_HAS_NO_HW_PREFETCH,
+ .matches = has_no_hw_prefetch,
+ },
+#ifdef CONFIG_ARM64_UAO
+ {
+ .desc = "User Access Override",
+ .capability = ARM64_HAS_UAO,
+ .matches = has_cpuid_feature,
+ .sys_reg = SYS_ID_AA64MMFR2_EL1,
+ .field_pos = ID_AA64MMFR2_UAO_SHIFT,
+ .min_field_value = 1,
+ .enable = cpu_enable_uao,
+ },
+#endif /* CONFIG_ARM64_UAO */
+#ifdef CONFIG_ARM64_PAN
+ {
+ .capability = ARM64_ALT_PAN_NOT_UAO,
+ .matches = cpufeature_pan_not_uao,
+ },
+#endif /* CONFIG_ARM64_PAN */
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ {
+ .capability = ARM64_UNMAP_KERNEL_AT_EL0,
+ .matches = unmap_kernel_at_el0,
+ },
+#endif
{},
};
@@ -686,7 +774,7 @@
{},
};
-static void cap_set_hwcap(const struct arm64_cpu_capabilities *cap)
+static void __init cap_set_hwcap(const struct arm64_cpu_capabilities *cap)
{
switch (cap->hwcap_type) {
case CAP_HWCAP:
@@ -731,12 +819,12 @@
return rc;
}
-static void setup_cpu_hwcaps(void)
+static void __init setup_cpu_hwcaps(void)
{
int i;
const struct arm64_cpu_capabilities *hwcaps = arm64_hwcaps;
- for (i = 0; hwcaps[i].desc; i++)
+ for (i = 0; hwcaps[i].matches; i++)
if (hwcaps[i].matches(&hwcaps[i]))
cap_set_hwcap(&hwcaps[i]);
}
@@ -746,11 +834,11 @@
{
int i;
- for (i = 0; caps[i].desc; i++) {
+ for (i = 0; caps[i].matches; i++) {
if (!caps[i].matches(&caps[i]))
continue;
- if (!cpus_have_cap(caps[i].capability))
+ if (!cpus_have_cap(caps[i].capability) && caps[i].desc)
pr_info("%s %s\n", info, caps[i].desc);
cpus_set_cap(caps[i].capability);
}
@@ -760,11 +848,12 @@
* Run through the enabled capabilities and enable() it on all active
* CPUs
*/
-static void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
+static void __init
+enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
{
int i;
- for (i = 0; caps[i].desc; i++)
+ for (i = 0; caps[i].matches; i++)
if (caps[i].enable && cpus_have_cap(caps[i].capability))
/*
* Use stop_machine() as it schedules the work allowing
@@ -798,35 +887,36 @@
static u64 __raw_read_system_reg(u32 sys_id)
{
switch (sys_id) {
- case SYS_ID_PFR0_EL1: return (u64)read_cpuid(ID_PFR0_EL1);
- case SYS_ID_PFR1_EL1: return (u64)read_cpuid(ID_PFR1_EL1);
- case SYS_ID_DFR0_EL1: return (u64)read_cpuid(ID_DFR0_EL1);
- case SYS_ID_MMFR0_EL1: return (u64)read_cpuid(ID_MMFR0_EL1);
- case SYS_ID_MMFR1_EL1: return (u64)read_cpuid(ID_MMFR1_EL1);
- case SYS_ID_MMFR2_EL1: return (u64)read_cpuid(ID_MMFR2_EL1);
- case SYS_ID_MMFR3_EL1: return (u64)read_cpuid(ID_MMFR3_EL1);
- case SYS_ID_ISAR0_EL1: return (u64)read_cpuid(ID_ISAR0_EL1);
- case SYS_ID_ISAR1_EL1: return (u64)read_cpuid(ID_ISAR1_EL1);
- case SYS_ID_ISAR2_EL1: return (u64)read_cpuid(ID_ISAR2_EL1);
- case SYS_ID_ISAR3_EL1: return (u64)read_cpuid(ID_ISAR3_EL1);
- case SYS_ID_ISAR4_EL1: return (u64)read_cpuid(ID_ISAR4_EL1);
- case SYS_ID_ISAR5_EL1: return (u64)read_cpuid(ID_ISAR4_EL1);
- case SYS_MVFR0_EL1: return (u64)read_cpuid(MVFR0_EL1);
- case SYS_MVFR1_EL1: return (u64)read_cpuid(MVFR1_EL1);
- case SYS_MVFR2_EL1: return (u64)read_cpuid(MVFR2_EL1);
+ case SYS_ID_PFR0_EL1: return read_cpuid(SYS_ID_PFR0_EL1);
+ case SYS_ID_PFR1_EL1: return read_cpuid(SYS_ID_PFR1_EL1);
+ case SYS_ID_DFR0_EL1: return read_cpuid(SYS_ID_DFR0_EL1);
+ case SYS_ID_MMFR0_EL1: return read_cpuid(SYS_ID_MMFR0_EL1);
+ case SYS_ID_MMFR1_EL1: return read_cpuid(SYS_ID_MMFR1_EL1);
+ case SYS_ID_MMFR2_EL1: return read_cpuid(SYS_ID_MMFR2_EL1);
+ case SYS_ID_MMFR3_EL1: return read_cpuid(SYS_ID_MMFR3_EL1);
+ case SYS_ID_ISAR0_EL1: return read_cpuid(SYS_ID_ISAR0_EL1);
+ case SYS_ID_ISAR1_EL1: return read_cpuid(SYS_ID_ISAR1_EL1);
+ case SYS_ID_ISAR2_EL1: return read_cpuid(SYS_ID_ISAR2_EL1);
+ case SYS_ID_ISAR3_EL1: return read_cpuid(SYS_ID_ISAR3_EL1);
+ case SYS_ID_ISAR4_EL1: return read_cpuid(SYS_ID_ISAR4_EL1);
+ case SYS_ID_ISAR5_EL1: return read_cpuid(SYS_ID_ISAR4_EL1);
+ case SYS_MVFR0_EL1: return read_cpuid(SYS_MVFR0_EL1);
+ case SYS_MVFR1_EL1: return read_cpuid(SYS_MVFR1_EL1);
+ case SYS_MVFR2_EL1: return read_cpuid(SYS_MVFR2_EL1);
- case SYS_ID_AA64PFR0_EL1: return (u64)read_cpuid(ID_AA64PFR0_EL1);
- case SYS_ID_AA64PFR1_EL1: return (u64)read_cpuid(ID_AA64PFR0_EL1);
- case SYS_ID_AA64DFR0_EL1: return (u64)read_cpuid(ID_AA64DFR0_EL1);
- case SYS_ID_AA64DFR1_EL1: return (u64)read_cpuid(ID_AA64DFR0_EL1);
- case SYS_ID_AA64MMFR0_EL1: return (u64)read_cpuid(ID_AA64MMFR0_EL1);
- case SYS_ID_AA64MMFR1_EL1: return (u64)read_cpuid(ID_AA64MMFR1_EL1);
- case SYS_ID_AA64ISAR0_EL1: return (u64)read_cpuid(ID_AA64ISAR0_EL1);
- case SYS_ID_AA64ISAR1_EL1: return (u64)read_cpuid(ID_AA64ISAR1_EL1);
+ case SYS_ID_AA64PFR0_EL1: return read_cpuid(SYS_ID_AA64PFR0_EL1);
+ case SYS_ID_AA64PFR1_EL1: return read_cpuid(SYS_ID_AA64PFR0_EL1);
+ case SYS_ID_AA64DFR0_EL1: return read_cpuid(SYS_ID_AA64DFR0_EL1);
+ case SYS_ID_AA64DFR1_EL1: return read_cpuid(SYS_ID_AA64DFR0_EL1);
+ case SYS_ID_AA64MMFR0_EL1: return read_cpuid(SYS_ID_AA64MMFR0_EL1);
+ case SYS_ID_AA64MMFR1_EL1: return read_cpuid(SYS_ID_AA64MMFR1_EL1);
+ case SYS_ID_AA64MMFR2_EL1: return read_cpuid(SYS_ID_AA64MMFR2_EL1);
+ case SYS_ID_AA64ISAR0_EL1: return read_cpuid(SYS_ID_AA64ISAR0_EL1);
+ case SYS_ID_AA64ISAR1_EL1: return read_cpuid(SYS_ID_AA64ISAR1_EL1);
- case SYS_CNTFRQ_EL0: return (u64)read_cpuid(CNTFRQ_EL0);
- case SYS_CTR_EL0: return (u64)read_cpuid(CTR_EL0);
- case SYS_DCZID_EL0: return (u64)read_cpuid(DCZID_EL0);
+ case SYS_CNTFRQ_EL0: return read_cpuid(SYS_CNTFRQ_EL0);
+ case SYS_CTR_EL0: return read_cpuid(SYS_CTR_EL0);
+ case SYS_DCZID_EL0: return read_cpuid(SYS_DCZID_EL0);
default:
BUG();
return 0;
@@ -876,7 +966,7 @@
return;
caps = arm64_features;
- for (i = 0; caps[i].desc; i++) {
+ for (i = 0; caps[i].matches; i++) {
if (!cpus_have_cap(caps[i].capability) || !caps[i].sys_reg)
continue;
/*
@@ -889,7 +979,7 @@
caps[i].enable(NULL);
}
- for (i = 0, caps = arm64_hwcaps; caps[i].desc; i++) {
+ for (i = 0, caps = arm64_hwcaps; caps[i].matches; i++) {
if (!cpus_have_hwcap(&caps[i]))
continue;
if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i]))
@@ -905,7 +995,7 @@
#endif /* CONFIG_HOTPLUG_CPU */
-static void setup_feature_capabilities(void)
+static void __init setup_feature_capabilities(void)
{
update_cpu_capabilities(arm64_features, "detected feature:");
enable_cpu_capabilities(arm64_features);
@@ -935,3 +1025,9 @@
pr_warn("L1_CACHE_BYTES smaller than the Cache Writeback Granule (%d < %d)\n",
L1_CACHE_BYTES, cls);
}
+
+static bool __maybe_unused
+cpufeature_pan_not_uao(const struct arm64_cpu_capabilities *entry)
+{
+ return (cpus_have_cap(ARM64_HAS_PAN) && !cpus_have_cap(ARM64_HAS_UAO));
+}
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 0166cfb..95a6fae 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -208,35 +208,36 @@
{
info->reg_cntfrq = arch_timer_get_cntfrq();
info->reg_ctr = read_cpuid_cachetype();
- info->reg_dczid = read_cpuid(DCZID_EL0);
+ info->reg_dczid = read_cpuid(SYS_DCZID_EL0);
info->reg_midr = read_cpuid_id();
- info->reg_id_aa64dfr0 = read_cpuid(ID_AA64DFR0_EL1);
- info->reg_id_aa64dfr1 = read_cpuid(ID_AA64DFR1_EL1);
- info->reg_id_aa64isar0 = read_cpuid(ID_AA64ISAR0_EL1);
- info->reg_id_aa64isar1 = read_cpuid(ID_AA64ISAR1_EL1);
- info->reg_id_aa64mmfr0 = read_cpuid(ID_AA64MMFR0_EL1);
- info->reg_id_aa64mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
- info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
- info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
+ info->reg_id_aa64dfr0 = read_cpuid(SYS_ID_AA64DFR0_EL1);
+ info->reg_id_aa64dfr1 = read_cpuid(SYS_ID_AA64DFR1_EL1);
+ info->reg_id_aa64isar0 = read_cpuid(SYS_ID_AA64ISAR0_EL1);
+ info->reg_id_aa64isar1 = read_cpuid(SYS_ID_AA64ISAR1_EL1);
+ info->reg_id_aa64mmfr0 = read_cpuid(SYS_ID_AA64MMFR0_EL1);
+ info->reg_id_aa64mmfr1 = read_cpuid(SYS_ID_AA64MMFR1_EL1);
+ info->reg_id_aa64mmfr2 = read_cpuid(SYS_ID_AA64MMFR2_EL1);
+ info->reg_id_aa64pfr0 = read_cpuid(SYS_ID_AA64PFR0_EL1);
+ info->reg_id_aa64pfr1 = read_cpuid(SYS_ID_AA64PFR1_EL1);
- info->reg_id_dfr0 = read_cpuid(ID_DFR0_EL1);
- info->reg_id_isar0 = read_cpuid(ID_ISAR0_EL1);
- info->reg_id_isar1 = read_cpuid(ID_ISAR1_EL1);
- info->reg_id_isar2 = read_cpuid(ID_ISAR2_EL1);
- info->reg_id_isar3 = read_cpuid(ID_ISAR3_EL1);
- info->reg_id_isar4 = read_cpuid(ID_ISAR4_EL1);
- info->reg_id_isar5 = read_cpuid(ID_ISAR5_EL1);
- info->reg_id_mmfr0 = read_cpuid(ID_MMFR0_EL1);
- info->reg_id_mmfr1 = read_cpuid(ID_MMFR1_EL1);
- info->reg_id_mmfr2 = read_cpuid(ID_MMFR2_EL1);
- info->reg_id_mmfr3 = read_cpuid(ID_MMFR3_EL1);
- info->reg_id_pfr0 = read_cpuid(ID_PFR0_EL1);
- info->reg_id_pfr1 = read_cpuid(ID_PFR1_EL1);
+ info->reg_id_dfr0 = read_cpuid(SYS_ID_DFR0_EL1);
+ info->reg_id_isar0 = read_cpuid(SYS_ID_ISAR0_EL1);
+ info->reg_id_isar1 = read_cpuid(SYS_ID_ISAR1_EL1);
+ info->reg_id_isar2 = read_cpuid(SYS_ID_ISAR2_EL1);
+ info->reg_id_isar3 = read_cpuid(SYS_ID_ISAR3_EL1);
+ info->reg_id_isar4 = read_cpuid(SYS_ID_ISAR4_EL1);
+ info->reg_id_isar5 = read_cpuid(SYS_ID_ISAR5_EL1);
+ info->reg_id_mmfr0 = read_cpuid(SYS_ID_MMFR0_EL1);
+ info->reg_id_mmfr1 = read_cpuid(SYS_ID_MMFR1_EL1);
+ info->reg_id_mmfr2 = read_cpuid(SYS_ID_MMFR2_EL1);
+ info->reg_id_mmfr3 = read_cpuid(SYS_ID_MMFR3_EL1);
+ info->reg_id_pfr0 = read_cpuid(SYS_ID_PFR0_EL1);
+ info->reg_id_pfr1 = read_cpuid(SYS_ID_PFR1_EL1);
- info->reg_mvfr0 = read_cpuid(MVFR0_EL1);
- info->reg_mvfr1 = read_cpuid(MVFR1_EL1);
- info->reg_mvfr2 = read_cpuid(MVFR2_EL1);
+ info->reg_mvfr0 = read_cpuid(SYS_MVFR0_EL1);
+ info->reg_mvfr1 = read_cpuid(SYS_MVFR1_EL1);
+ info->reg_mvfr2 = read_cpuid(SYS_MVFR2_EL1);
cpuinfo_detect_icache_policy(info);
diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index a773db9..f82036e 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@
*/
mov x20, x0 // DTB address
ldr x0, [sp, #16] // relocated _text address
- ldr x21, =stext_offset
+ movz x21, #:abs_g0:stext_offset
add x21, x0, x21
/*
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 4eeb171..b6abc85 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -11,317 +11,34 @@
*
*/
-#include <linux/atomic.h>
#include <linux/dmi.h>
#include <linux/efi.h>
-#include <linux/export.h>
-#include <linux/memblock.h>
-#include <linux/mm_types.h>
-#include <linux/bootmem.h>
-#include <linux/of.h>
-#include <linux/of_fdt.h>
-#include <linux/preempt.h>
-#include <linux/rbtree.h>
-#include <linux/rwsem.h>
-#include <linux/sched.h>
-#include <linux/slab.h>
-#include <linux/spinlock.h>
+#include <linux/init.h>
-#include <asm/cacheflush.h>
#include <asm/efi.h>
-#include <asm/tlbflush.h>
-#include <asm/mmu_context.h>
-#include <asm/mmu.h>
-#include <asm/pgtable.h>
-struct efi_memory_map memmap;
-
-static u64 efi_system_table;
-
-static pgd_t efi_pgd[PTRS_PER_PGD] __page_aligned_bss;
-
-static struct mm_struct efi_mm = {
- .mm_rb = RB_ROOT,
- .pgd = efi_pgd,
- .mm_users = ATOMIC_INIT(2),
- .mm_count = ATOMIC_INIT(1),
- .mmap_sem = __RWSEM_INITIALIZER(efi_mm.mmap_sem),
- .page_table_lock = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
- .mmlist = LIST_HEAD_INIT(efi_mm.mmlist),
-};
-
-static int __init is_normal_ram(efi_memory_desc_t *md)
+int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
{
- if (md->attribute & EFI_MEMORY_WB)
- return 1;
- return 0;
-}
-
-/*
- * Translate a EFI virtual address into a physical address: this is necessary,
- * as some data members of the EFI system table are virtually remapped after
- * SetVirtualAddressMap() has been called.
- */
-static phys_addr_t efi_to_phys(unsigned long addr)
-{
- efi_memory_desc_t *md;
-
- for_each_efi_memory_desc(&memmap, md) {
- if (!(md->attribute & EFI_MEMORY_RUNTIME))
- continue;
- if (md->virt_addr == 0)
- /* no virtual mapping has been installed by the stub */
- break;
- if (md->virt_addr <= addr &&
- (addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
- return md->phys_addr + addr - md->virt_addr;
- }
- return addr;
-}
-
-static int __init uefi_init(void)
-{
- efi_char16_t *c16;
- void *config_tables;
- u64 table_size;
- char vendor[100] = "unknown";
- int i, retval;
-
- efi.systab = early_memremap(efi_system_table,
- sizeof(efi_system_table_t));
- if (efi.systab == NULL) {
- pr_warn("Unable to map EFI system table.\n");
- return -ENOMEM;
- }
-
- set_bit(EFI_BOOT, &efi.flags);
- set_bit(EFI_64BIT, &efi.flags);
+ pteval_t prot_val;
/*
- * Verify the EFI Table
+ * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
+ * executable, everything else can be mapped with the XN bits
+ * set.
*/
- if (efi.systab->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) {
- pr_err("System table signature incorrect\n");
- retval = -EINVAL;
- goto out;
- }
- if ((efi.systab->hdr.revision >> 16) < 2)
- pr_warn("Warning: EFI system table version %d.%02d, expected 2.00 or greater\n",
- efi.systab->hdr.revision >> 16,
- efi.systab->hdr.revision & 0xffff);
+ if ((md->attribute & EFI_MEMORY_WB) == 0)
+ prot_val = PROT_DEVICE_nGnRE;
+ else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
+ !PAGE_ALIGNED(md->phys_addr))
+ prot_val = pgprot_val(PAGE_KERNEL_EXEC);
+ else
+ prot_val = pgprot_val(PAGE_KERNEL);
- /* Show what we know for posterity */
- c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
- sizeof(vendor) * sizeof(efi_char16_t));
- if (c16) {
- for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
- vendor[i] = c16[i];
- vendor[i] = '\0';
- early_memunmap(c16, sizeof(vendor) * sizeof(efi_char16_t));
- }
-
- pr_info("EFI v%u.%.02u by %s\n",
- efi.systab->hdr.revision >> 16,
- efi.systab->hdr.revision & 0xffff, vendor);
-
- table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
- config_tables = early_memremap(efi_to_phys(efi.systab->tables),
- table_size);
- if (config_tables == NULL) {
- pr_warn("Unable to map EFI config table array.\n");
- retval = -ENOMEM;
- goto out;
- }
- retval = efi_config_parse_tables(config_tables, efi.systab->nr_tables,
- sizeof(efi_config_table_64_t), NULL);
-
- early_memunmap(config_tables, table_size);
-out:
- early_memunmap(efi.systab, sizeof(efi_system_table_t));
- return retval;
-}
-
-/*
- * Return true for RAM regions we want to permanently reserve.
- */
-static __init int is_reserve_region(efi_memory_desc_t *md)
-{
- switch (md->type) {
- case EFI_LOADER_CODE:
- case EFI_LOADER_DATA:
- case EFI_BOOT_SERVICES_CODE:
- case EFI_BOOT_SERVICES_DATA:
- case EFI_CONVENTIONAL_MEMORY:
- case EFI_PERSISTENT_MEMORY:
- return 0;
- default:
- break;
- }
- return is_normal_ram(md);
-}
-
-static __init void reserve_regions(void)
-{
- efi_memory_desc_t *md;
- u64 paddr, npages, size;
-
- if (efi_enabled(EFI_DBG))
- pr_info("Processing EFI memory map:\n");
-
- for_each_efi_memory_desc(&memmap, md) {
- paddr = md->phys_addr;
- npages = md->num_pages;
-
- if (efi_enabled(EFI_DBG)) {
- char buf[64];
-
- pr_info(" 0x%012llx-0x%012llx %s",
- paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
- efi_md_typeattr_format(buf, sizeof(buf), md));
- }
-
- memrange_efi_to_native(&paddr, &npages);
- size = npages << PAGE_SHIFT;
-
- if (is_normal_ram(md))
- early_init_dt_add_memory_arch(paddr, size);
-
- if (is_reserve_region(md)) {
- memblock_reserve(paddr, size);
- if (efi_enabled(EFI_DBG))
- pr_cont("*");
- }
-
- if (efi_enabled(EFI_DBG))
- pr_cont("\n");
- }
-
- set_bit(EFI_MEMMAP, &efi.flags);
-}
-
-void __init efi_init(void)
-{
- struct efi_fdt_params params;
-
- /* Grab UEFI information placed in FDT by stub */
- if (!efi_get_fdt_params(¶ms))
- return;
-
- efi_system_table = params.system_table;
-
- memblock_reserve(params.mmap & PAGE_MASK,
- PAGE_ALIGN(params.mmap_size + (params.mmap & ~PAGE_MASK)));
- memmap.phys_map = params.mmap;
- memmap.map = early_memremap(params.mmap, params.mmap_size);
- if (memmap.map == NULL) {
- /*
- * If we are booting via UEFI, the UEFI memory map is the only
- * description of memory we have, so there is little point in
- * proceeding if we cannot access it.
- */
- panic("Unable to map EFI memory map.\n");
- }
- memmap.map_end = memmap.map + params.mmap_size;
- memmap.desc_size = params.desc_size;
- memmap.desc_version = params.desc_ver;
-
- if (uefi_init() < 0)
- return;
-
- reserve_regions();
- early_memunmap(memmap.map, params.mmap_size);
-}
-
-static bool __init efi_virtmap_init(void)
-{
- efi_memory_desc_t *md;
-
- init_new_context(NULL, &efi_mm);
-
- for_each_efi_memory_desc(&memmap, md) {
- pgprot_t prot;
-
- if (!(md->attribute & EFI_MEMORY_RUNTIME))
- continue;
- if (md->virt_addr == 0)
- return false;
-
- pr_info(" EFI remap 0x%016llx => %p\n",
- md->phys_addr, (void *)md->virt_addr);
-
- /*
- * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
- * executable, everything else can be mapped with the XN bits
- * set.
- */
- if (!is_normal_ram(md))
- prot = __pgprot(PROT_DEVICE_nGnRE);
- else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
- !PAGE_ALIGNED(md->phys_addr))
- prot = PAGE_KERNEL_EXEC;
- else
- prot = PAGE_KERNEL;
-
- create_pgd_mapping(&efi_mm, md->phys_addr, md->virt_addr,
- md->num_pages << EFI_PAGE_SHIFT,
- __pgprot(pgprot_val(prot) | PTE_NG));
- }
- return true;
-}
-
-/*
- * Enable the UEFI Runtime Services if all prerequisites are in place, i.e.,
- * non-early mapping of the UEFI system table and virtual mappings for all
- * EFI_MEMORY_RUNTIME regions.
- */
-static int __init arm64_enable_runtime_services(void)
-{
- u64 mapsize;
-
- if (!efi_enabled(EFI_BOOT)) {
- pr_info("EFI services will not be available.\n");
- return 0;
- }
-
- if (efi_runtime_disabled()) {
- pr_info("EFI runtime services will be disabled.\n");
- return 0;
- }
-
- pr_info("Remapping and enabling EFI services.\n");
-
- mapsize = memmap.map_end - memmap.map;
- memmap.map = (__force void *)ioremap_cache(memmap.phys_map,
- mapsize);
- if (!memmap.map) {
- pr_err("Failed to remap EFI memory map\n");
- return -ENOMEM;
- }
- memmap.map_end = memmap.map + mapsize;
- efi.memmap = &memmap;
-
- efi.systab = (__force void *)ioremap_cache(efi_system_table,
- sizeof(efi_system_table_t));
- if (!efi.systab) {
- pr_err("Failed to remap EFI System Table\n");
- return -ENOMEM;
- }
- set_bit(EFI_SYSTEM_TABLES, &efi.flags);
-
- if (!efi_virtmap_init()) {
- pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
- return -ENOMEM;
- }
-
- /* Set up runtime services function pointers */
- efi_native_runtime_setup();
- set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
-
- efi.runtime_version = efi.systab->hdr.revision;
-
+ create_pgd_mapping(mm, md->phys_addr, md->virt_addr,
+ md->num_pages << EFI_PAGE_SHIFT,
+ __pgprot(prot_val | PTE_NG));
return 0;
}
-early_initcall(arm64_enable_runtime_services);
static int __init arm64_dmi_init(void)
{
@@ -337,23 +54,6 @@
}
core_initcall(arm64_dmi_init);
-static void efi_set_pgd(struct mm_struct *mm)
-{
- switch_mm(NULL, mm, NULL);
-}
-
-void efi_virtmap_load(void)
-{
- preempt_disable();
- efi_set_pgd(&efi_mm);
-}
-
-void efi_virtmap_unload(void)
-{
- efi_set_pgd(current->active_mm);
- preempt_enable();
-}
-
/*
* UpdateCapsule() depends on the system being shutdown via
* ResetSystem().
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index dccd0c2..4ed652a 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -27,8 +27,12 @@
#include <asm/cpufeature.h>
#include <asm/errno.h>
#include <asm/esr.h>
+#include <asm/irq.h>
#include <asm/memory.h>
+#include <asm/mmu.h>
+#include <asm/ptrace.h>
#include <asm/thread_info.h>
+#include <asm/uaccess.h>
#include <asm/asm-uaccess.h>
#include <asm/unistd.h>
@@ -67,8 +71,31 @@
#define BAD_FIQ 2
#define BAD_ERROR 3
- .macro kernel_entry, el, regsize = 64
+ .macro kernel_ventry, el, label, regsize = 64
+ .align 7
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+alternative_if ARM64_UNMAP_KERNEL_AT_EL0
+ .if \el == 0
+ .if \regsize == 64
+ mrs x30, tpidrro_el0
+ msr tpidrro_el0, xzr
+ .else
+ mov x30, xzr
+ .endif
+ .endif
+alternative_else_nop_endif
+#endif
+
sub sp, sp, #S_FRAME_SIZE
+ b el\()\el\()_\label
+ .endm
+
+ .macro tramp_alias, dst, sym
+ mov_q \dst, TRAMP_VALIAS
+ add \dst, \dst, #(\sym - .entry.tramp.text)
+ .endm
+
+ .macro kernel_entry, el, regsize = 64
.if \regsize == 32
mov w0, w0 // zero upper 32 bits of x0
.endif
@@ -90,21 +117,64 @@
.if \el == 0
mrs x21, sp_el0
- get_thread_info tsk // Ensure MDSCR_EL1.SS is clear,
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr_this_cpu tsk, __entry_task, x20 // Ensure MDSCR_EL1.SS is clear,
+ ldr x19, [tsk, #TSK_TI_FLAGS] // since we can unmask debug
+#else
+ mov tsk, sp
+ and tsk, tsk, #~(THREAD_SIZE - 1) // Ensure MDSCR_EL1.SS is clear,
ldr x19, [tsk, #TI_FLAGS] // since we can unmask debug
+#endif
disable_step_tsk x19, x20 // exceptions when scheduling.
+
+ mov x29, xzr // fp pointed to user-space
.else
add x21, sp, #S_FRAME_SIZE
get_thread_info tsk
/* Save the task's original addr_limit and set USER_DS (TASK_SIZE_64) */
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x20, [tsk, #TSK_TI_ADDR_LIMIT]
+#else
ldr x20, [tsk, #TI_ADDR_LIMIT]
+#endif
str x20, [sp, #S_ORIG_ADDR_LIMIT]
mov x20, #TASK_SIZE_64
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ str x20, [tsk, #TSK_TI_ADDR_LIMIT]
+#else
str x20, [tsk, #TI_ADDR_LIMIT]
+#endif
+ ALTERNATIVE(nop, SET_PSTATE_UAO(0), ARM64_HAS_UAO, CONFIG_ARM64_UAO)
.endif /* \el == 0 */
mrs x22, elr_el1
mrs x23, spsr_el1
stp lr, x21, [sp, #S_LR]
+
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+ /*
+ * Set the TTBR0 PAN bit in SPSR. When the exception is taken from
+ * EL0, there is no need to check the state of TTBR0_EL1 since
+ * accesses are always enabled.
+ * Note that the meaning of this bit differs from the ARMv8.1 PAN
+ * feature as all TTBR0_EL1 accesses are disabled, not just those to
+ * user mappings.
+ */
+alternative_if ARM64_HAS_PAN
+ b 1f // skip TTBR0 PAN
+alternative_else_nop_endif
+
+ .if \el != 0
+ mrs x21, ttbr0_el1
+ tst x21, #TTBR_ASID_MASK // Check for the reserved ASID
+ orr x23, x23, #PSR_PAN_BIT // Set the emulated PAN in the saved SPSR
+ b.eq 1f // TTBR0 access already disabled
+ and x23, x23, #~PSR_PAN_BIT // Clear the emulated PAN in the saved SPSR
+ .endif
+
+ __uaccess_ttbr0_disable x21
+1:
+#endif
+
stp x22, x23, [sp, #S_PC]
/*
@@ -116,6 +186,13 @@
.endif
/*
+ * Set sp_el0 to current thread_info.
+ */
+ .if \el == 0
+ msr sp_el0, tsk
+ .endif
+
+ /*
* Registers that may be useful after this macro is invoked:
*
* x21 - aborted SP
@@ -128,33 +205,70 @@
.if \el != 0
/* Restore the task's original addr_limit. */
ldr x20, [sp, #S_ORIG_ADDR_LIMIT]
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ str x20, [tsk, #TSK_TI_ADDR_LIMIT]
+#else
str x20, [tsk, #TI_ADDR_LIMIT]
+#endif
+
+ /* No need to restore UAO, it will be restored from SPSR_EL1 */
.endif
ldp x21, x22, [sp, #S_PC] // load ELR, SPSR
.if \el == 0
ct_user_enter
+ .endif
+
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+ /*
+ * Restore access to TTBR0_EL1. If returning to EL0, no need for SPSR
+ * PAN bit checking.
+ */
+alternative_if ARM64_HAS_PAN
+ b 2f // skip TTBR0 PAN
+alternative_else_nop_endif
+
+ .if \el != 0
+ tbnz x22, #22, 1f // Skip re-enabling TTBR0 access if the PSR_PAN_BIT is set
+ .endif
+
+ __uaccess_ttbr0_enable x0, x1
+
+ .if \el == 0
+ /*
+ * Enable errata workarounds only if returning to user. The only
+ * workaround currently required for TTBR0_EL1 changes are for the
+ * Cavium erratum 27456 (broadcast TLBI instructions may cause I-cache
+ * corruption).
+ */
+ bl post_ttbr_update_workaround
+ .endif
+1:
+ .if \el != 0
+ and x22, x22, #~PSR_PAN_BIT // ARMv8.0 CPUs do not understand this bit
+ .endif
+2:
+#endif
+
+ .if \el == 0
ldr x23, [sp, #S_SP] // load return stack pointer
msr sp_el0, x23
+ tst x22, #PSR_MODE32_BIT // native task?
+ b.eq 3f
+
#ifdef CONFIG_ARM64_ERRATUM_845719
-alternative_if_not ARM64_WORKAROUND_845719
- nop
- nop
-#ifdef CONFIG_PID_IN_CONTEXTIDR
- nop
-#endif
-alternative_else
- tbz x22, #4, 1f
+alternative_if ARM64_WORKAROUND_845719
#ifdef CONFIG_PID_IN_CONTEXTIDR
mrs x29, contextidr_el1
msr contextidr_el1, x29
#else
msr contextidr_el1, xzr
#endif
-1:
-alternative_endif
+alternative_else_nop_endif
#endif
+3:
.endif
+
msr elr_el1, x21 // set up the return data
msr spsr_el1, x22
ldp x0, x1, [sp, #16 * 0]
@@ -174,12 +288,65 @@
ldp x28, x29, [sp, #16 * 14]
ldr lr, [sp, #S_LR]
add sp, sp, #S_FRAME_SIZE // restore sp
- eret // return to kernel
+
+ .if \el == 0
+alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ bne 4f
+ msr far_el1, x30
+ tramp_alias x30, tramp_exit_native
+ br x30
+4:
+ tramp_alias x30, tramp_exit_compat
+ br x30
+#endif
+ .else
+ eret
+ .endif
.endm
- .macro get_thread_info, rd
- mov \rd, sp
- and \rd, \rd, #~(THREAD_SIZE - 1) // top of stack
+ .macro irq_stack_entry
+ mov x19, sp // preserve the original sp
+
+ /*
+ * Compare sp with the base of the task stack.
+ * If the top ~(THREAD_SIZE - 1) bits match, we are on a task stack,
+ * and should switch to the irq stack.
+ */
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x25, [tsk, TSK_STACK]
+ eor x25, x25, x19
+ and x25, x25, #~(THREAD_SIZE - 1)
+ cbnz x25, 9998f
+#else
+ and x25, x19, #~(THREAD_SIZE - 1)
+ cmp x25, tsk
+ b.ne 9998f
+#endif
+
+ adr_this_cpu x25, irq_stack, x26
+ mov x26, #IRQ_STACK_START_SP
+ add x26, x25, x26
+
+ /* switch to the irq stack */
+ mov sp, x26
+
+ /*
+ * Add a dummy stack frame, this non-standard format is fixed up
+ * by unwind_frame()
+ */
+ stp x29, x19, [sp, #-16]!
+ mov x29, sp
+
+9998:
+ .endm
+
+ /*
+ * x19 should be preserved between irq_stack_entry and
+ * irq_stack_exit.
+ */
+ .macro irq_stack_exit
+ mov sp, x19
.endm
/*
@@ -197,10 +364,11 @@
* Interrupt handling.
*/
.macro irq_handler
- adrp x1, handle_arch_irq
- ldr x1, [x1, #:lo12:handle_arch_irq]
+ ldr_l x1, handle_arch_irq
mov x0, sp
+ irq_stack_entry
blr x1
+ irq_stack_exit
.endm
.text
@@ -211,31 +379,31 @@
.align 11
ENTRY(vectors)
- ventry el1_sync_invalid // Synchronous EL1t
- ventry el1_irq_invalid // IRQ EL1t
- ventry el1_fiq_invalid // FIQ EL1t
- ventry el1_error_invalid // Error EL1t
+ kernel_ventry 1, sync_invalid // Synchronous EL1t
+ kernel_ventry 1, irq_invalid // IRQ EL1t
+ kernel_ventry 1, fiq_invalid // FIQ EL1t
+ kernel_ventry 1, error_invalid // Error EL1t
- ventry el1_sync // Synchronous EL1h
- ventry el1_irq // IRQ EL1h
- ventry el1_fiq_invalid // FIQ EL1h
- ventry el1_error_invalid // Error EL1h
+ kernel_ventry 1, sync // Synchronous EL1h
+ kernel_ventry 1, irq // IRQ EL1h
+ kernel_ventry 1, fiq_invalid // FIQ EL1h
+ kernel_ventry 1, error_invalid // Error EL1h
- ventry el0_sync // Synchronous 64-bit EL0
- ventry el0_irq // IRQ 64-bit EL0
- ventry el0_fiq_invalid // FIQ 64-bit EL0
- ventry el0_error_invalid // Error 64-bit EL0
+ kernel_ventry 0, sync // Synchronous 64-bit EL0
+ kernel_ventry 0, irq // IRQ 64-bit EL0
+ kernel_ventry 0, fiq_invalid // FIQ 64-bit EL0
+ kernel_ventry 0, error_invalid // Error 64-bit EL0
#ifdef CONFIG_COMPAT
- ventry el0_sync_compat // Synchronous 32-bit EL0
- ventry el0_irq_compat // IRQ 32-bit EL0
- ventry el0_fiq_invalid_compat // FIQ 32-bit EL0
- ventry el0_error_invalid_compat // Error 32-bit EL0
+ kernel_ventry 0, sync_compat, 32 // Synchronous 32-bit EL0
+ kernel_ventry 0, irq_compat, 32 // IRQ 32-bit EL0
+ kernel_ventry 0, fiq_invalid_compat, 32 // FIQ 32-bit EL0
+ kernel_ventry 0, error_invalid_compat, 32 // Error 32-bit EL0
#else
- ventry el0_sync_invalid // Synchronous 32-bit EL0
- ventry el0_irq_invalid // IRQ 32-bit EL0
- ventry el0_fiq_invalid // FIQ 32-bit EL0
- ventry el0_error_invalid // Error 32-bit EL0
+ kernel_ventry 0, sync_invalid, 32 // Synchronous 32-bit EL0
+ kernel_ventry 0, irq_invalid, 32 // IRQ 32-bit EL0
+ kernel_ventry 0, fiq_invalid, 32 // FIQ 32-bit EL0
+ kernel_ventry 0, error_invalid, 32 // Error 32-bit EL0
#endif
END(vectors)
@@ -243,7 +411,7 @@
* Invalid mode handlers
*/
.macro inv_entry, el, reason, regsize = 64
- kernel_entry el, \regsize
+ kernel_entry \el, \regsize
mov x0, sp
mov x1, #\reason
mrs x2, esr_el1
@@ -302,6 +470,8 @@
lsr x24, x1, #ESR_ELx_EC_SHIFT // exception class
cmp x24, #ESR_ELx_EC_DABT_CUR // data abort in EL1
b.eq el1_da
+ cmp x24, #ESR_ELx_EC_IABT_CUR // instruction abort in EL1
+ b.eq el1_ia
cmp x24, #ESR_ELx_EC_SYS64 // configurable trap
b.eq el1_undef
cmp x24, #ESR_ELx_EC_SP_ALIGN // stack alignment exception
@@ -313,6 +483,11 @@
cmp x24, #ESR_ELx_EC_BREAKPT_CUR // debug exception in EL1
b.ge el1_dbg
b el1_inv
+
+el1_ia:
+ /*
+ * Fall through to the Data abort case
+ */
el1_da:
/*
* Data abort handling
@@ -376,10 +551,17 @@
irq_handler
#ifdef CONFIG_PREEMPT
- get_thread_info tsk
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr w24, [tsk, #TSK_TI_PREEMPT] // get preempt count
+#else
ldr w24, [tsk, #TI_PREEMPT] // get preempt count
+#endif
cbnz w24, 1f // preempt count != 0
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x0, [tsk, #TSK_TI_FLAGS] // get flags
+#else
ldr x0, [tsk, #TI_FLAGS] // get flags
+#endif
tbz x0, #TIF_NEED_RESCHED, 1f // needs rescheduling?
bl el1_preempt
1:
@@ -394,7 +576,11 @@
el1_preempt:
mov x24, lr
1: bl preempt_schedule_irq // irq en/disable is done inside
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x0, [tsk, #TSK_TI_FLAGS] // get new tasks TI_FLAGS
+#else
ldr x0, [tsk, #TI_FLAGS] // get new tasks TI_FLAGS
+#endif
tbnz x0, #TIF_NEED_RESCHED, 1b // needs rescheduling?
ret x24
#endif
@@ -418,7 +604,7 @@
cmp x24, #ESR_ELx_EC_FP_EXC64 // FP/ASIMD exception
b.eq el0_fpsimd_exc
cmp x24, #ESR_ELx_EC_SYS64 // configurable trap
- b.eq el0_undef
+ b.eq el0_sys
cmp x24, #ESR_ELx_EC_SP_ALIGN // stack alignment exception
b.eq el0_sp_pc
cmp x24, #ESR_ELx_EC_PC_ALIGN // pc alignment exception
@@ -499,7 +685,7 @@
enable_dbg_and_irq
ct_user_exit
mov x0, x26
- orr x1, x25, #1 << 24 // use reserved ISS bit for instruction aborts
+ mov x1, x25
mov x2, sp
bl do_mem_abort
b ret_to_user
@@ -546,6 +732,16 @@
mov x0, sp
bl do_undefinstr
b ret_to_user
+el0_sys:
+ /*
+ * System instructions, for trapped cache maintenance instructions
+ */
+ enable_dbg_and_irq
+ ct_user_exit
+ mov x0, x25
+ mov x1, sp
+ bl do_sysinstr
+ b ret_to_user
el0_dbg:
/*
* Debug exception handling
@@ -614,6 +810,12 @@
ldp x29, x9, [x8], #16
ldr lr, [x8]
mov sp, x9
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ msr sp_el0, x1
+#else
+ and x9, x9, #~(THREAD_SIZE - 1)
+ msr sp_el0, x9
+#endif
ret
ENDPROC(cpu_switch_to)
@@ -624,7 +826,11 @@
ret_fast_syscall:
disable_irq // disable interrupts
str x0, [sp, #S_X0] // returned x0
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
+#else
ldr x1, [tsk, #TI_FLAGS] // re-check for syscall tracing
+#endif
and x2, x1, #_TIF_SYSCALL_WORK
cbnz x2, ret_fast_syscall_trace
and x2, x1, #_TIF_WORK_MASK
@@ -641,14 +847,14 @@
work_pending:
tbnz x1, #TIF_NEED_RESCHED, work_resched
/* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
- ldr x2, [sp, #S_PSTATE]
mov x0, sp // 'regs'
- tst x2, #PSR_MODE_MASK // user mode regs?
- b.ne no_work_pending // returning to kernel
enable_irq // enable interrupts for do_notify_resume()
bl do_notify_resume
b ret_to_user
work_resched:
+#ifdef CONFIG_TRACE_IRQFLAGS
+ bl trace_hardirqs_off // the IRQs are off here, inform the tracing code
+#endif
bl schedule
/*
@@ -656,11 +862,14 @@
*/
ret_to_user:
disable_irq // disable interrupts
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x1, [tsk, #TSK_TI_FLAGS]
+#else
ldr x1, [tsk, #TI_FLAGS]
+#endif
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
enable_step_tsk x1, x2
-no_work_pending:
kernel_exit 0
ENDPROC(ret_to_user)
@@ -689,7 +898,11 @@
enable_dbg_and_irq
ct_user_exit 1
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ ldr x16, [tsk, #TSK_TI_FLAGS] // check for syscall hooks
+#else
ldr x16, [tsk, #TI_FLAGS] // check for syscall hooks
+#endif
tst x16, #_TIF_SYSCALL_WORK
b.ne __sys_trace
cmp scno, sc_nr // check upper syscall limit
@@ -740,6 +953,119 @@
bl do_ni_syscall
b __sys_trace_return
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+/*
+ * Exception vectors trampoline.
+ */
+ .pushsection ".entry.tramp.text", "ax"
+
+ .macro tramp_map_kernel, tmp
+ mrs \tmp, ttbr1_el1
+ sub \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+ bic \tmp, \tmp, #USER_ASID_FLAG
+ msr ttbr1_el1, \tmp
+#ifdef CONFIG_ARCH_MSM8996
+ /* ASID already in \tmp[63:48] */
+ movk \tmp, #:abs_g2_nc:(TRAMP_VALIAS >> 12)
+ movk \tmp, #:abs_g1_nc:(TRAMP_VALIAS >> 12)
+ /* 2MB boundary containing the vectors, so we nobble the walk cache */
+ movk \tmp, #:abs_g0_nc:((TRAMP_VALIAS & ~(SZ_2M - 1)) >> 12)
+ isb
+ tlbi vae1, \tmp
+ dsb nsh
+#endif /* CONFIG_ARCH_MSM8996 */
+ .endm
+
+ .macro tramp_unmap_kernel, tmp
+ mrs \tmp, ttbr1_el1
+ add \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+ orr \tmp, \tmp, #USER_ASID_FLAG
+ msr ttbr1_el1, \tmp
+ /*
+ * We avoid running the post_ttbr_update_workaround here because the
+ * user and kernel ASIDs don't have conflicting mappings, so any
+ * "blessing" as described in:
+ *
+ * http://lkml.kernel.org/r/56BB848A.6060603@caviumnetworks.com
+ *
+ * will not hurt correctness. Whilst this may partially defeat the
+ * point of using split ASIDs in the first place, it avoids
+ * the hit of invalidating the entire I-cache on every return to
+ * userspace.
+ */
+ .endm
+
+ .macro tramp_ventry, regsize = 64
+ .align 7
+1:
+ .if \regsize == 64
+ msr tpidrro_el0, x30 // Restored in kernel_ventry
+ .endif
+ bl 2f
+ b .
+2:
+ tramp_map_kernel x30
+#ifdef CONFIG_RANDOMIZE_BASE
+ adr x30, tramp_vectors + PAGE_SIZE
+#ifndef CONFIG_ARCH_MSM8996
+ isb
+#endif
+ ldr x30, [x30]
+#else
+ ldr x30, =vectors
+#endif
+ prfm plil1strm, [x30, #(1b - tramp_vectors)]
+ msr vbar_el1, x30
+ add x30, x30, #(1b - tramp_vectors)
+ isb
+ ret
+ .endm
+
+ .macro tramp_exit, regsize = 64
+ adr x30, tramp_vectors
+ msr vbar_el1, x30
+ tramp_unmap_kernel x30
+ .if \regsize == 64
+ mrs x30, far_el1
+ .endif
+ eret
+ .endm
+
+ .align 11
+ENTRY(tramp_vectors)
+ .space 0x400
+
+ tramp_ventry
+ tramp_ventry
+ tramp_ventry
+ tramp_ventry
+
+ tramp_ventry 32
+ tramp_ventry 32
+ tramp_ventry 32
+ tramp_ventry 32
+END(tramp_vectors)
+
+ENTRY(tramp_exit_native)
+ tramp_exit
+END(tramp_exit_native)
+
+ENTRY(tramp_exit_compat)
+ tramp_exit 32
+END(tramp_exit_compat)
+
+ .ltorg
+ .popsection // .entry.tramp.text
+#ifdef CONFIG_RANDOMIZE_BASE
+ .pushsection ".rodata", "a"
+ .align PAGE_SHIFT
+ .globl __entry_tramp_data_start
+__entry_tramp_data_start:
+ .quad vectors
+ .popsection // .rodata
+#endif /* CONFIG_RANDOMIZE_BASE */
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
/*
* Special system call wrappers.
*/
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 6638903..f995dae 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -291,7 +291,7 @@
.notifier_call = fpsimd_cpu_pm_notifier,
};
-static void fpsimd_pm_init(void)
+static void __init fpsimd_pm_init(void)
{
cpu_pm_register_notifier(&fpsimd_cpu_pm_notifier_block);
}
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index c851be7..ebecf9a 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -29,12 +29,11 @@
/*
* Note:
- * Due to modules and __init, code can disappear and change,
- * we need to protect against faulting as well as code changing.
- * We do this by aarch64_insn_*() which use the probe_kernel_*().
- *
- * No lock is held here because all the modifications are run
- * through stop_machine().
+ * We are paranoid about modifying text, as if a bug were to happen, it
+ * could cause us to read or write to someplace that could cause harm.
+ * Carefully read and modify the code with aarch64_insn_*() which uses
+ * probe_kernel_*(), and make sure what we read is what we expected it
+ * to be before modifying it.
*/
if (validate) {
if (aarch64_insn_read((void *)pc, &replaced))
@@ -93,6 +92,11 @@
return ftrace_modify_code(pc, old, new, true);
}
+void arch_ftrace_update_code(int command)
+{
+ ftrace_modify_all_code(command);
+}
+
int __init ftrace_dyn_arch_init(void)
{
return 0;
@@ -125,23 +129,20 @@
* on other archs. It's unlikely on AArch64.
*/
old = *parent;
- *parent = return_hooker;
trace.func = self_addr;
trace.depth = current->curr_ret_stack + 1;
/* Only trace if the calling function expects to */
- if (!ftrace_graph_entry(&trace)) {
- *parent = old;
+ if (!ftrace_graph_entry(&trace))
return;
- }
err = ftrace_push_return_trace(old, self_addr, &trace.depth,
frame_pointer);
- if (err == -EBUSY) {
- *parent = old;
+ if (err == -EBUSY)
return;
- }
+ else
+ *parent = return_hooker;
}
#ifdef CONFIG_DYNAMIC_FTRACE
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index d019c3a..5dfc0ff 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -25,10 +25,12 @@
#include <linux/irqchip/arm-gic-v3.h>
#include <asm/assembler.h>
+#include <asm/boot.h>
#include <asm/ptrace.h>
#include <asm/asm-offsets.h>
#include <asm/cache.h>
#include <asm/cputype.h>
+#include <asm/elf.h>
#include <asm/kernel-pgtable.h>
#include <asm/memory.h>
#include <asm/pgtable-hwdef.h>
@@ -67,12 +69,11 @@
* in the entry routines.
*/
__HEAD
-
+_head:
/*
* DO NOT MODIFY. Image header expected by Linux boot-loaders.
*/
#ifdef CONFIG_EFI
-efi_head:
/*
* This add instruction has no meaningful effect except that
* its opcode forms the magic "MZ" signature required by UEFI.
@@ -83,9 +84,9 @@
b stext // branch to kernel start, magic
.long 0 // reserved
#endif
- .quad _kernel_offset_le // Image load offset from start of RAM, little-endian
- .quad _kernel_size_le // Effective size of kernel image, little-endian
- .quad _kernel_flags_le // Informative flags, little-endian
+ le64sym _kernel_offset_le // Image load offset from start of RAM, little-endian
+ le64sym _kernel_size_le // Effective size of kernel image, little-endian
+ le64sym _kernel_flags_le // Informative flags, little-endian
.quad 0 // reserved
.quad 0 // reserved
.quad 0 // reserved
@@ -94,14 +95,14 @@
.byte 0x4d
.byte 0x64
#ifdef CONFIG_EFI
- .long pe_header - efi_head // Offset to the PE header.
+ .long pe_header - _head // Offset to the PE header.
#else
.word 0 // reserved
#endif
#ifdef CONFIG_EFI
.globl __efistub_stext_offset
- .set __efistub_stext_offset, stext - efi_head
+ .set __efistub_stext_offset, stext - _head
.align 3
pe_header:
.ascii "PE"
@@ -124,7 +125,7 @@
.long _end - stext // SizeOfCode
.long 0 // SizeOfInitializedData
.long 0 // SizeOfUninitializedData
- .long __efistub_entry - efi_head // AddressOfEntryPoint
+ .long __efistub_entry - _head // AddressOfEntryPoint
.long __efistub_stext_offset // BaseOfCode
extra_header_fields:
@@ -139,7 +140,7 @@
.short 0 // MinorSubsystemVersion
.long 0 // Win32VersionValue
- .long _end - efi_head // SizeOfImage
+ .long _end - _head // SizeOfImage
// Everything before the kernel image is considered part of the header
.long __efistub_stext_offset // SizeOfHeaders
@@ -211,6 +212,7 @@
bl preserve_boot_args
bl el2_setup // Drop to EL1, w20=cpu_boot_mode
adrp x24, __PHYS_OFFSET
+ and x23, x24, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0
bl set_cpu_boot_mode_flag
bl __create_page_tables // x25=TTBR0, x26=TTBR1
/*
@@ -219,11 +221,13 @@
* On return, the CPU will be ready for the MMU to be turned on and
* the TCR will have been set.
*/
- ldr x27, =__mmap_switched // address to jump to after
+ ldr x27, 0f // address to jump to after
// MMU has been enabled
adr_l lr, __enable_mmu // return (PIC) address
b __cpu_setup // initialise processor
ENDPROC(stext)
+ .align 3
+0: .quad __mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
/*
* Preserve the arguments passed by the bootloader in x0 .. x3
@@ -311,21 +315,21 @@
__create_page_tables:
adrp x25, idmap_pg_dir
adrp x26, swapper_pg_dir
- mov x27, lr
+ mov x28, lr
/*
* Invalidate the idmap and swapper page tables to avoid potential
* dirty cache lines being evicted.
*/
mov x0, x25
- add x1, x26, #SWAPPER_DIR_SIZE
+ add x1, x26, #SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE
bl __inval_cache_range
/*
* Clear the idmap and swapper page tables.
*/
mov x0, x25
- add x6, x26, #SWAPPER_DIR_SIZE
+ add x6, x26, #SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE
1: stp xzr, xzr, [x0], #16
stp xzr, xzr, [x0], #16
stp xzr, xzr, [x0], #16
@@ -389,9 +393,11 @@
* Map the kernel image (starting with PHYS_OFFSET).
*/
mov x0, x26 // swapper_pg_dir
- mov x5, #PAGE_OFFSET
+ ldr x5, =KIMAGE_VADDR
+ add x5, x5, x23 // add KASLR displacement
create_pgd_entry x0, x5, x3, x6
- ldr x6, =KERNEL_END // __va(KERNEL_END)
+ ldr w6, kernel_img_size
+ add x6, x6, x5
mov x3, x24 // phys offset
create_block_map x0, x7, x3, x5, x6
@@ -401,13 +407,15 @@
* tables again to remove any speculatively loaded cache lines.
*/
mov x0, x25
- add x1, x26, #SWAPPER_DIR_SIZE
+ add x1, x26, #SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE
dmb sy
bl __inval_cache_range
- mov lr, x27
- ret
+ ret x28
ENDPROC(__create_page_tables)
+
+kernel_img_size:
+ .long _end - (_head - TEXT_OFFSET)
.ltorg
/*
@@ -415,21 +423,92 @@
*/
.set initial_sp, init_thread_union + THREAD_START_SP
__mmap_switched:
- adr_l x6, __bss_start
- adr_l x7, __bss_stop
+ mov x28, lr // preserve LR
-1: cmp x6, x7
+ adr_l x8, vectors // load VBAR_EL1 with virtual
+ msr vbar_el1, x8 // vector table address
+ isb
+
+ // Clear BSS
+ adr_l x0, __bss_start
+ mov x1, xzr
+ adr_l x2, __bss_stop
+ sub x2, x2, x0
+ bl __pi_memset
+ dsb ishst // Make zero page visible to PTW
+
+#ifdef CONFIG_RELOCATABLE
+
+ /*
+ * Iterate over each entry in the relocation table, and apply the
+ * relocations in place.
+ */
+ adr_l x8, __dynsym_start // start of symbol table
+ adr_l x9, __reloc_start // start of reloc table
+ adr_l x10, __reloc_end // end of reloc table
+
+0: cmp x9, x10
b.hs 2f
- str xzr, [x6], #8 // Clear BSS
- b 1b
-2:
+ ldp x11, x12, [x9], #24
+ ldr x13, [x9, #-8]
+ cmp w12, #R_AARCH64_RELATIVE
+ b.ne 1f
+ add x13, x13, x23 // relocate
+ str x13, [x11, x23]
+ b 0b
+
+1: cmp w12, #R_AARCH64_ABS64
+ b.ne 0b
+ add x12, x12, x12, lsl #1 // symtab offset: 24x top word
+ add x12, x8, x12, lsr #(32 - 3) // ... shifted into bottom word
+ ldrsh w14, [x12, #6] // Elf64_Sym::st_shndx
+ ldr x15, [x12, #8] // Elf64_Sym::st_value
+ cmp w14, #-0xf // SHN_ABS (0xfff1) ?
+ add x14, x15, x23 // relocate
+ csel x15, x14, x15, ne
+ add x15, x13, x15
+ str x15, [x11, x23]
+ b 0b
+
+2: adr_l x8, kimage_vaddr // make relocated kimage_vaddr
+ dc cvac, x8 // value visible to secondaries
+ dsb sy // with MMU off
+#endif
+
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ adrp x4, init_thread_union
+ add sp, x4, #THREAD_SIZE
+ adr_l x5, init_task
+ msr sp_el0, x5 // Save thread_info
+#else
adr_l sp, initial_sp, x4
+ mov x4, sp
+ and x4, x4, #~(THREAD_SIZE - 1)
+ msr sp_el0, x4 // Save thread_info
+#endif
+
str_l x21, __fdt_pointer, x5 // Save FDT pointer
- str_l x24, memstart_addr, x6 // Save PHYS_OFFSET
+
+ ldr_l x4, kimage_vaddr // Save the offset between
+ sub x4, x4, x24 // the kernel virtual and
+ str_l x4, kimage_voffset, x5 // physical mappings
+
mov x29, #0
#ifdef CONFIG_KASAN
bl kasan_early_init
#endif
+#ifdef CONFIG_RANDOMIZE_BASE
+ tst x23, ~(MIN_KIMG_ALIGN - 1) // already running randomized?
+ b.ne 0f
+ mov x0, x21 // pass FDT address in x0
+ mov x1, x23 // pass modulo offset in x1
+ bl kaslr_early_init // parse FDT for KASLR options
+ cbz x0, 0f // KASLR disabled? just proceed
+ orr x23, x23, x0 // record KASLR offset
+ ret x28 // we must enable KASLR, return
+ // to __enable_mmu()
+0:
+#endif
b start_kernel
ENDPROC(__mmap_switched)
@@ -438,6 +517,10 @@
* hotplug and needs to have the same protections as the text region
*/
.section ".text","ax"
+
+ENTRY(kimage_vaddr)
+ .quad _text - TEXT_OFFSET
+
/*
* If we're fortunate enough to boot at EL2, ensure that the world is
* sane before dropping to EL1.
@@ -605,14 +688,29 @@
adrp x26, swapper_pg_dir
bl __cpu_setup // initialise processor
- ldr x21, =secondary_data
- ldr x27, =__secondary_switched // address to jump to after enabling the MMU
+ ldr x8, kimage_vaddr
+ ldr w9, 0f
+ sub x27, x8, w9, sxtw // address to jump to after enabling the MMU
b __enable_mmu
ENDPROC(secondary_startup)
+0: .long (_text - TEXT_OFFSET) - __secondary_switched
ENTRY(__secondary_switched)
- ldr x0, [x21] // get secondary_data.stack
+ adr_l x5, vectors
+ msr vbar_el1, x5
+ isb
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+ adr_l x0, secondary_data
+ ldr x1, [x0, #CPU_BOOT_STACK] // get secondary_data.stack
+ mov sp, x1
+ ldr x2, [x0, #CPU_BOOT_TASK]
+ msr sp_el0, x2
+#else
+ ldr_l x0, secondary_data // get secondary_data.stack
mov sp, x0
+ and x0, x0, #~(THREAD_SIZE - 1)
+ msr sp_el0, x0 // save thread_info
+#endif
mov x29, #0
b secondary_start_kernel
ENDPROC(__secondary_switched)
@@ -630,12 +728,11 @@
*/
.section ".idmap.text", "ax"
__enable_mmu:
+ mrs x18, sctlr_el1 // preserve old SCTLR_EL1 value
mrs x1, ID_AA64MMFR0_EL1
ubfx x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
b.ne __no_granule_support
- ldr x5, =vectors
- msr vbar_el1, x5
msr ttbr0_el1, x25 // load TTBR0
msr ttbr1_el1, x26 // load TTBR1
isb
@@ -649,6 +746,29 @@
ic iallu
dsb nsh
isb
+#ifdef CONFIG_RANDOMIZE_BASE
+ mov x19, x0 // preserve new SCTLR_EL1 value
+ blr x27
+
+ /*
+ * If we return here, we have a KASLR displacement in x23 which we need
+ * to take into account by discarding the current kernel mapping and
+ * creating a new one.
+ */
+ msr sctlr_el1, x18 // disable the MMU
+ isb
+ bl __create_page_tables // recreate kernel mapping
+
+ tlbi vmalle1 // Remove any stale TLB entries
+ dsb nsh
+
+ msr sctlr_el1, x19 // re-enable the MMU
+ isb
+ ic iallu // flush instructions fetched
+ dsb nsh // via old mapping
+ isb
+ add x27, x27, x23 // relocated __mmap_switched
+#endif
br x27
ENDPROC(__enable_mmu)
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index eeebfc3..629c002 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -314,9 +314,21 @@
case ARM_BREAKPOINT_LEN_2:
len_in_bytes = 2;
break;
+ case ARM_BREAKPOINT_LEN_3:
+ len_in_bytes = 3;
+ break;
case ARM_BREAKPOINT_LEN_4:
len_in_bytes = 4;
break;
+ case ARM_BREAKPOINT_LEN_5:
+ len_in_bytes = 5;
+ break;
+ case ARM_BREAKPOINT_LEN_6:
+ len_in_bytes = 6;
+ break;
+ case ARM_BREAKPOINT_LEN_7:
+ len_in_bytes = 7;
+ break;
case ARM_BREAKPOINT_LEN_8:
len_in_bytes = 8;
break;
@@ -346,7 +358,7 @@
* to generic breakpoint descriptions.
*/
int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
- int *gen_len, int *gen_type)
+ int *gen_len, int *gen_type, int *offset)
{
/* Type */
switch (ctrl.type) {
@@ -366,17 +378,33 @@
return -EINVAL;
}
+ if (!ctrl.len)
+ return -EINVAL;
+ *offset = __ffs(ctrl.len);
+
/* Len */
- switch (ctrl.len) {
+ switch (ctrl.len >> *offset) {
case ARM_BREAKPOINT_LEN_1:
*gen_len = HW_BREAKPOINT_LEN_1;
break;
case ARM_BREAKPOINT_LEN_2:
*gen_len = HW_BREAKPOINT_LEN_2;
break;
+ case ARM_BREAKPOINT_LEN_3:
+ *gen_len = HW_BREAKPOINT_LEN_3;
+ break;
case ARM_BREAKPOINT_LEN_4:
*gen_len = HW_BREAKPOINT_LEN_4;
break;
+ case ARM_BREAKPOINT_LEN_5:
+ *gen_len = HW_BREAKPOINT_LEN_5;
+ break;
+ case ARM_BREAKPOINT_LEN_6:
+ *gen_len = HW_BREAKPOINT_LEN_6;
+ break;
+ case ARM_BREAKPOINT_LEN_7:
+ *gen_len = HW_BREAKPOINT_LEN_7;
+ break;
case ARM_BREAKPOINT_LEN_8:
*gen_len = HW_BREAKPOINT_LEN_8;
break;
@@ -420,9 +448,21 @@
case HW_BREAKPOINT_LEN_2:
info->ctrl.len = ARM_BREAKPOINT_LEN_2;
break;
+ case HW_BREAKPOINT_LEN_3:
+ info->ctrl.len = ARM_BREAKPOINT_LEN_3;
+ break;
case HW_BREAKPOINT_LEN_4:
info->ctrl.len = ARM_BREAKPOINT_LEN_4;
break;
+ case HW_BREAKPOINT_LEN_5:
+ info->ctrl.len = ARM_BREAKPOINT_LEN_5;
+ break;
+ case HW_BREAKPOINT_LEN_6:
+ info->ctrl.len = ARM_BREAKPOINT_LEN_6;
+ break;
+ case HW_BREAKPOINT_LEN_7:
+ info->ctrl.len = ARM_BREAKPOINT_LEN_7;
+ break;
case HW_BREAKPOINT_LEN_8:
info->ctrl.len = ARM_BREAKPOINT_LEN_8;
break;
@@ -514,18 +554,17 @@
default:
return -EINVAL;
}
-
- info->address &= ~alignment_mask;
- info->ctrl.len <<= offset;
} else {
if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE)
alignment_mask = 0x3;
else
alignment_mask = 0x7;
- if (info->address & alignment_mask)
- return -EINVAL;
+ offset = info->address & alignment_mask;
}
+ info->address &= ~alignment_mask;
+ info->ctrl.len <<= offset;
+
/*
* Disallow per-task kernel breakpoints since these would
* complicate the stepping code.
@@ -656,12 +695,47 @@
return 0;
}
+/*
+ * Arm64 hardware does not always report a watchpoint hit address that matches
+ * one of the watchpoints set. It can also report an address "near" the
+ * watchpoint if a single instruction access both watched and unwatched
+ * addresses. There is no straight-forward way, short of disassembling the
+ * offending instruction, to map that address back to the watchpoint. This
+ * function computes the distance of the memory access from the watchpoint as a
+ * heuristic for the likelyhood that a given access triggered the watchpoint.
+ *
+ * See Section D2.10.5 "Determining the memory location that caused a Watchpoint
+ * exception" of ARMv8 Architecture Reference Manual for details.
<