Approach for CPUFreq
in Xen on ARM
Oleksandr Tyshchenko, Lead Software Engineer
EPAM Systems Inc.
2018
DEVELOPER AND
DESIGN SUMMIT
Introduction
We are a team of developers at EPAM Systems Inc.,
based in Kyiv, Ukraine
We are focused on:
• Xen on ARM
• Renesas R-Car Gen3 platform support and maintenance in Xen
• Real-Time use-cases
• Automotive use-cases
• Para-virtual drivers and backends: sound, display, GPU
• SoC’s HW virtualization
• TEE integration
• Power management
• Android HALs
• FuSa (IEC 61508 / ISO 26262) certification
• We are members of GENIVI and AGL and are pushing for Xen usage there
• Yocto based build system for multi-domain distributions
We are upstreaming to the Xen Project, the Linux kernel and many other
OSS projects:
see us at https://github.com/xen-troops
Agenda
1. What is it about?
2. Current status
3. Possible approaches
4. CPUFreq PoC
5. Benchmarking results
6. Improvement (guest input)
What benefit is expected from CPUFreq?
Xen is becoming more popular in the embedded world,
and the demand for CPUFreq comes from there.
There are two orthogonal key things CPUFreq
can help with:
• Performance
Use higher frequencies where required
− Get better performance on “heavy” use cases
− Speed up the boot process
• Power savings
Use lower frequencies where possible
− Save power on “light” use cases (extend the battery life
of battery powered devices)
− Prevent possible CPU overheating, which may shorten the CPU’s lifetime
Why does Xen need to be involved?
The CPUFreq feature works quite well in Linux these days (a lot of
architectures are supported, including ARM).
So why not just pass the CPUFreq related stuff through
to the hardware domain and let things work as they do
in bare Linux?
CPU virtualization is done by Xen.
A generic guest doesn’t know anything about pCPUs because
it runs on vCPUs. It is not aware of:
• How many pCPUs the entire system has
• What the CPU topology is
• What the actual CPU load is
• How the vCPUs it owns map to pCPUs
Obviously we can’t create a working solution without
modifying Xen, as it is the only system component
which has the required information.
Current status
Xen on x86 already has CPUFreq support out
of the box, but ARM support is still missing.
There are two possible modes
1. Domain 0 based CPUFreq
2. Hypervisor based CPUFreq
The “Hypervisor based CPUFreq” is more interesting and
more appropriate for ARM based embedded platforms,
because the “Domain 0 based CPUFreq” implies:
• Keeping the number of vCPUs in Domain 0 equal to
the number of pCPUs and pinning the vCPUs
(which is not always welcome)
• Having the CPUFreq infrastructure, including HW drivers, in
Domain 0 (which may conflict with “Thin Domain 0”
or “non-Linux Domain 0” requirements)
Hypervisor based CPUFreq
It has all the required components
• Core
• Governor
• Scaling driver
It supports two scaling drivers for x86
• ACPI Processor P-States Driver
• AMD Architectural P-state
It requires Domain 0 to be involved,
not to pass requests to HW, but just to
• Parse the ACPI tables and upload the information to Xen
• Control CPUFreq parameters at run-time and collect statistics
It is ACPI specific...
Picture from https://wiki.xenproject.org/wiki/Xen_power_management
ARM support is missing so far
Oleksandr Dmytryshyn made an attempt
to add support for ARM,
which included work on making the ACPI specific “Hypervisor
based CPUFreq” more generic.
He proposed a split model, where a frontend driver in Xen
interacts with a backend driver in the Linux hardware domain
in order to scale pCPUs.
Linux patches (RFC V5)
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
Xen patches (RFC V5)
https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
* The pros and cons of this approach are considered
in the “Possible approaches” section.
Possible approaches of CPUFreq on ARM
The main problem is the “frequency changing interface” in a virtualized system.
The CPUFreq core just makes a decision and issues a platform dependent call which carries
the frequency the CPU should run at.
Who should be in charge of physically setting the new frequency?
• Xen
• Hardware domain
• Dedicated IP core (PM coprocessor)
The list of components which are usually involved in Dynamic Voltage
and Frequency Scaling (DVFS) is quite long.
It may also vary across different ARM platforms. For “Renesas R-Car Gen3” platforms they are:
• Generic clock, regulator and thermal frameworks
• Platform clock, PMIC, AVS and THS drivers
• I2C support, etc.
• Definitely something I missed
The possible approaches we are about to consider differ exactly in the frequency changing interface.
“Xen+hwdom” solution
Hardware domain scales pCPUs
Oleksandr Dmytryshyn proposed a split model, where the “hwdom-cpufreq” frontend driver in Xen interacts
with the “xen-cpufreq” backend driver in the Linux hardware domain in order to scale pCPUs.
But this solution hasn’t been accepted by the Xen community yet. This enabling work is frozen
(there hasn’t been any activity on this topic since 2014).
PROS
• The beauty of this approach is that we don’t need to port
a lot of DVFS specific things to Xen. A CPUFreq scaling
driver in Xen which doesn’t pass requests to HW but asks
hwdom to do that is a generic solution which
can easily be reused by many ARM platforms.
CONS
• The way the “xen-cpufreq” backend driver and glue
layer were implemented won’t be accepted by the Linux
community. A redesign is definitely needed.
• Complex communication interface between Xen and
hwdom: event channel, hypercalls, syscalls, custom DT
bindings in order to provide pCPU info to hwdom, etc. There are
still unanswered major questions from the Xen community
regarding synchronization.
• It isn’t a completely safe solution. Letting a guest manage
device PM is one thing, but letting it manage one or more
pCPUs is a big risk. A guest operating the HW may crash,
even the hardware domain. A malicious domain may power
down the whole system!
“All-in-Xen” solution
Xen scales pCPUs
This implies that all DVFS specific things are located in Xen. Obviously it is not supposed to be a copy of the huge Linux frameworks;
in Xen it is supposed to be much simpler than in Linux. Still, it is quite clear that we would have to port all the support required by the
CPUFreq scaling drivers into Xen to be able to actually realise certain OPPs.
* OPP (Operating Performance Point) is a tuple of frequency and minimum voltage.
PROS
• Although not generic, it is a safe and architecturally
cleaner solution than the “Xen+hwdom” one. With everything in Xen
we depend neither on a potentially malicious
domain’s behavior nor on buggy drivers running there.
CONS
• We would end up with a lot of code in Xen, since
we would have to keep one CPUFreq scaling driver
implementation for every ARM platform we want to
support in Xen.
• An enormous development effort is expected to get this
support in (the scope of required components looks huge)
and to maintain it. What is more, it may not even be feasible
to implement this on some ARM platforms (separate CPU
clock, synchronization issues).
“Xen+SCP” solution (1/4)
This is yet another approach, based on the generic and currently popular
ARM SCMI (SCPI) protocol.
What is the SCMI protocol?
The System Control and Management Interface (SCMI) protocol follows
the recent industry trend of providing embedded microcontrollers in
systems to abstract various power and other system management tasks.
The protocol is supposed to be used between this embedded
microcontroller, which is the System Control Processor (SCP) in terms of
the protocol, and the Application Processors (AP). The mailbox feature
provides a mechanism for inter-processor communication (IPC).
Typically the SCP provides a lot of services. One of these services is
performance management, which is the ability to control the
performance of a domain composed of compute engines such as
APs and other accelerators. This includes DVFS for a core/cluster, which is
what we actually need for CPUFreq.
The specification is officially published; the protocol itself and
SCMI based drivers are already upstreamed to Linux.
ARM SCMI protocol
http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
ARM SCPI protocol
http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
“Xen+SCP” solution (2/4)
SCP scales pCPUs
The general idea of this approach is that there is firmware which, acting as
a server, runs on a “dedicated IP core” and provides SCPI services. On the
other side there is a CPUFreq scaling driver in Xen which, acting as a client,
consumes these services (DVFS).
The Xen driver neither changes frequency/voltage by itself nor cooperates with
the Linux hwdom to do that job. It just signals an OPP change request to
the SCP directly, using the SCMI protocol. The Xen driver also needs to query all
OPPs supported by the platform when it starts.
Requirements
• A mailbox IP integrated into the SoC is required for IPC
(a simple doorbell for triggering an action and a shared memory
region for commands)
• This approach implies that the corresponding firmware which
acts as the SCP is already present
The possible issue here is the presence of that “dedicated IP core”
• It may be absent on the target platform altogether
• It may perform tasks other than PM
But is the lack of a “dedicated IP core” a blocker?
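The client/server split described above can be sketched as follows. This is a minimal illustration, not the Xen driver: `Opp`, `FakeScp` and `CpufreqClient` are hypothetical names, the voltage values are placeholders, and only the three command names and the frequencies are taken from these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opp:
    freq_khz: int
    min_uv: int  # minimum voltage in microvolts (placeholder values below)


class FakeScp:
    """Hypothetical stand-in for the SCP firmware (the server side)."""

    def __init__(self, opps):
        self.opps = list(opps)
        self.current_idx = 0

    def handle(self, command, payload=None):
        if command == "dvfs_get_info":
            return list(self.opps)
        if command == "dvfs_get_idx":
            return self.current_idx
        if command == "dvfs_set_idx":
            if not 0 <= payload < len(self.opps):
                raise ValueError("OPP index out of range")
            # Real firmware would program the clock and the PMIC here.
            self.current_idx = payload
            return None
        raise ValueError(f"unsupported command: {command}")


class CpufreqClient:
    """Hypothetical client: it never touches clocks or regulators itself."""

    def __init__(self, scp):
        self.scp = scp
        # Query all OPPs supported by the platform once, at start-up.
        self.opps = scp.handle("dvfs_get_info")

    def request_opp(self, idx):
        # Signal the OPP change request; the SCP does the actual scaling.
        self.scp.handle("dvfs_set_idx", idx)
        return self.opps[self.scp.handle("dvfs_get_idx")]


scp = FakeScp([Opp(500000, 820000), Opp(1000000, 820000), Opp(1500000, 830000)])
client = CpufreqClient(scp)
selected = client.request_opp(2)
```

In a real system the `handle` call would be a mailbox transaction rather than a direct method call.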
“Xen+SCP” solution (3/4)
No, the lack of a “dedicated IP core” isn’t a blocker.
Andre Przywara’s “SMC triggered mailbox” approach, with his
PoC code for Allwinner, demonstrates that.
The idea is to teach the firmware which runs on the very same AP cores as
Xen, but at the EL3 exception level, to perform SCP functions
and to use Secure Monitor Calls (SMC) for communication. Such a
solution is a good compromise for all ARM platforms that do
have firmware running at the EL3 exception level (for example ARM TF)
and don’t have a candidate for being the SCP. Not even a dedicated mailbox IP is
needed for this purpose.
The “SMC triggered mailbox” driver emulates a mailbox which signals
transmitted data via the SMC instruction. The mailbox receiver is
implemented in firmware and synchronously returns data when it
returns execution to the non-secure world again. This allows us
both to trigger a request and to transfer execution to the firmware code in
a safe and architected way (like PSCI requests).
SMC triggered mailbox
http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
SMC/HVC
http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
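The synchronous nature of such a mailbox can be illustrated with a small model. This is only a sketch under stated assumptions: the shared memory is a `bytearray`, the SMC instruction is modelled as a plain function call, and `make_firmware`, `SmcMailbox` and the toy "upper-case echo" service are hypothetical.

```python
SHMEM_SIZE = 16


def make_firmware(shmem):
    """Return a callable standing in for the EL3 firmware's SMC handler."""

    def smc_handler():
        request = bytes(shmem).rstrip(b"\0")
        reply = request.upper()  # toy service: echo the request in upper case
        shmem[:] = reply.ljust(SHMEM_SIZE, b"\0")
        return 0  # status, as it would be returned in a register

    return smc_handler


class SmcMailbox:
    """Emulated mailbox: the 'doorbell' is a synchronous call into firmware."""

    def __init__(self, shmem, smc):
        self.shmem = shmem
        self.smc = smc

    def send(self, message):
        if len(message) > SHMEM_SIZE:
            raise ValueError("message does not fit into shared memory")
        # 1. Place the command into the shared memory region.
        self.shmem[:] = message.ljust(SHMEM_SIZE, b"\0")
        # 2. Trigger the firmware; the call returns only when it is done.
        status = self.smc()
        if status != 0:
            raise RuntimeError("firmware reported an error")
        # 3. The reply is already in shared memory; no RX interrupt is needed.
        return bytes(self.shmem).rstrip(b"\0")


shmem = bytearray(SHMEM_SIZE)
mailbox = SmcMailbox(shmem, make_firmware(shmem))
reply = mailbox.send(b"ping")
```

Because the reply is available as soon as the call returns, no completion interrupt or polling loop is needed on the sender's side.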
“Xen+SCP” solution (4/4)
PROS
• It is a secure and architecturally clean solution. There is no
reason not to trust firmware in an embedded microcontroller
or firmware running in a trusted SW layer (like ARM TF).
• It is a generic solution which can easily be
reused by many ARM platforms.
The only platform-dependent component is the
mailbox driver, implementing the minimum
set of functions. So all we have to add to support
a new platform is a simple mailbox driver.
• The Xen side is easy to implement compared to the previous
approaches. Once the common part is implemented, it
is easy to add support for a new platform, provided
that it already has the proper firmware.
• No new ABI, hypercalls, syscalls, DT bindings, etc. as in the
“Xen+hwdom” approach. No complex communication
interface.
• Only this approach allows guest input regarding
frequency changes to be handled in Xen with minimal
code modifications.
CONS
• The corresponding firmware which provides the SCPI
services must be present. It can be either firmware
which runs on a “dedicated IP core” or firmware which
runs on the very same AP cores as Xen, but at the
EL3 exception level (ARM TF).
• It may be necessary to emulate all SCPI commands in Xen (to
be an SCP for guests). The SCPI protocol in Linux may be used
not only for DVFS, but also for device runtime
PM and clock management, so we can’t drop this ability just
because we want to run “CPUFreq enabled” Xen. This is
a limitation of SCPI only, which, unlike SCMI, doesn’t seem able
to manage parallel connections to the SCP from different SW layers
correctly.
Conclusion
What we want is a generic, secure and architecturally clean solution.
I thought (and still think) that the “Xen+SCP” solution is the most appropriate for
this goal among all the solutions considered. The CPUFreq PoC implements
exactly this solution, although with some limitations.
However, if you have yet another solution, we can consider it as well.
CPUFreq PoC (1/14)
Main points
1. ARM SCPI
I built this PoC on top of the SCPI protocol.
But why not the SCMI protocol?
• When I was doing my research, upstream
Linux support for SCMI was missing, but
SCPI support had already been
upstreamed, and there were enough good
examples of how to use it
• The range of capabilities SCPI had was
enough for implementing this PoC
The situation has been changing since that
time and we will probably move to SCMI
for the final solution. Or Xen may support
both protocols; it is open to discussion…
2. “SMC triggered mailbox”
I borrowed the idea of the “SMC triggered
mailbox” driver, which emulates a mailbox
that signals transmitted data via the SMC
instruction, with firmware running at the
EL3 exception level (ARM TF).
The reason was the lack of a free
“dedicated IP core” for providing SCPI
services on the “Renesas R-Car Gen3”
platform I worked with. The idea of
using ARM TF as an SCPI server and SMC
calls for communication looked
reasonable to me.
3. Modified firmware (ARM TF)
In my case it was feasible to modify ARM
TF, as the official BSP release I used had both
firmware and software. In the classic
embedded scenario, where both firmware
and software are provided by the same
entity, it is going to be feasible as well.
Using Andre Przywara’s PoC for Allwinner
as an example, I managed to prepare
something working for R-Car Gen3.
CPUFreq PoC (2/14)
The CPUFreq feature works out of the box if we
run bare Linux from the vendor’s BSP.
The BSP release for the Renesas R-Car Gen3 platform comes
with CPUFreq support enabled in Linux; it uses the
“cpufreq-dt” driver. The OPPs, clocks and cpu-supply
properties are described in the platform DT file and this
driver extracts this information.
Example of original pCPU node
a57_0: cpu@0 {
    compatible = "arm,cortex-a57", "arm,armv8";
    reg = <0x0>;
    device_type = "cpu";
    power-domains = <&sysc R8A7795_PD_CA57_CPU0>;
    next-level-cache = <&L2_CA57>;
    enable-method = "psci";
    cpu-idle-states = <&CPU_SLEEP_0>;
    #cooling-cells = <2>;
    dynamic-power-coefficient = <854>;
    cooling-min-level = <0>;
    cooling-max-level = <2>;
    clocks = <&cpg CPG_CORE R8A7795_CLK_Z>;
    operating-points-v2 = <&cluster0_opp_tb0>,
        <&cluster0_opp_tb1>, <&cluster0_opp_tb2>,
        <&cluster0_opp_tb3>, <&cluster0_opp_tb4>,
        <&cluster0_opp_tb5>, <&cluster0_opp_tb6>,
        <&cluster0_opp_tb7>;
    cpu-supply = <&vdd_dvfs>;
};
CPUFreq PoC (3/14)
But if we run that Linux as Domain 0
(hardware domain) on top of Xen, the
CPUFreq feature is broken.
When creating the DT for the domain, Xen inserts only dummy
CPU nodes. The number of inserted CPU nodes
is equal to the number of vCPUs assigned to this domain.
All CPU properties the original DT has, such as OPPs,
clocks, regulators, etc., are not passed to the guest’s DT.
Example of guest vCPU node
cpu@0 {
    device_type = "cpu";
    compatible = "arm,armv8";
    enable-method = "psci";
    reg = <0x0>;
};
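The effect described above can be pictured with a small generator that emits one minimal CPU node per vCPU and nothing else. This is a simplified sketch, not Xen's DT building code: `make_guest_cpu_nodes` is a hypothetical helper, and using the vCPU number directly as the `reg` value is an assumption made for illustration.

```python
def make_guest_cpu_nodes(num_vcpus):
    """Render one minimal DT cpu node per vCPU (illustrative only)."""
    nodes = []
    for vcpu in range(num_vcpus):
        nodes.append(
            f"cpu@{vcpu:x} {{\n"
            '    device_type = "cpu";\n'
            '    compatible = "arm,armv8";\n'
            '    enable-method = "psci";\n'
            f"    reg = <0x{vcpu:x}>;\n"
            "};"
        )
    return "\n".join(nodes)


guest_dt = make_guest_cpu_nodes(2)
print(guest_dt)
```

None of the `clocks`, `operating-points-v2` or `cpu-supply` properties appear in the result, which is why a DT based CPUFreq driver in the guest has nothing to work with.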
CPUFreq PoC (4/14)
It started from this point...
CPUFreq PoC (5/14)
Changes in Linux guest
• Disable the CPUFreq feature in the Linux defconfig
• Remove from the DT and platform files all components
involved in CPU scaling, such as clocks, the DVFS I2C bus, etc.
Leaving such components available to the guest
could negatively affect CPU scaling done by other
SW layers (Xen, ARM TF). The Linux guest may
decide a component is unused and disable it...
CPUFreq PoC (6/14)
Changes in ARM TF
• Modify the firmware to be able to act as an SCPI server and
provide DVFS services (just an emulator at that
point, to be able to develop the Xen side).
Normally we write drivers for the existing
firmware, not vice versa...
CPUFreq PoC (7/14)
Changes in DT
• Add an SCPI node with all required properties such as
shmem, mboxes, and so on
• Enable hypervisor based CPUFreq and set an initial
governor via the command line passed in the DT
Changes in Xen
• Rebase Oleksandr Dmytryshyn’s patch series which
makes the ACPI specific CPUFreq stuff more generic
• Port a bunch of DT helpers and macros from Linux
• Create a few misc patches for the ARM subsystem
(some preparations for SCPI based CPUFreq).
There is a dedicated DT binding which defines
the interface to the firmware implementing SCPI.
The DT parsing code in Xen needs to be compatible with
the existing DTs describing the SCPI implementation.
CPUFreq PoC (8/14)
Changes in Xen
• Port the whole SCPI protocol from Linux
• Port the mailbox framework from Linux (as the protocol relies
on the mailbox feature)
• Modify the directly ported code
(Xen is not allowed to sleep, so there can be no
mutexes, completions, etc.)
There is definitely room for optimization...
CPUFreq PoC (9/14)
Changes in Xen
• Port the “SMC triggered mailbox” driver
and modify it a bit.
The “SMC triggered mailbox” is a completely
“synchronous” mailbox because of the nature of SMC.
But “asynchronous” mailboxes can be used as well. The
one limitation is that the mailbox HW must have an RX-done
interrupt. Possible candidates are
• ARM MHU
• Rockchip mailbox
• Others
CPUFreq PoC (10/14)
Changes in Xen
• Develop the CPUFreq interface component
The interface component performs the following steps
• Initialize everything needed for the CPUFreq scaling
driver to be functional inside Xen (SCPI protocol,
mailbox, etc.)
• Register the future CPUFreq scaling driver
• Populate CPUs
• Get the DVFS info, which is the OPP list and the latency
information, for all online DVFS capable CPUs using the
SCPI protocol
• Convert these capabilities into performance states and
rearrange them, as performance states must start from
the higher values
• Upload the resulting PM data to the CPUFreq framework
The hardware domain doesn’t need to be involved
(unlike the ACPI parser case on x86), since we already have
everything at hand in Xen.
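The conversion step (OPP list to performance states, highest first) can be sketched like this. The function name `opps_to_pstates` and the dictionary layout are hypothetical; the frequencies are the ones reported by “xenpm” in these slides.

```python
def opps_to_pstates(opps_khz):
    """Order OPP frequencies so that P0 is the fastest state."""
    ordered = sorted(set(opps_khz), reverse=True)
    return [{"state": f"P{i}", "freq_khz": freq} for i, freq in enumerate(ordered)]


# The firmware is assumed here to report OPPs in ascending order.
pstates = opps_to_pstates([500000, 1000000, 1500000, 1600000, 1700000])
for entry in pstates:
    print(entry["state"], entry["freq_khz"])
```

Sorting in descending order matches the ACPI convention, where P0 denotes the highest performance state, so the existing P-state oriented code can consume the table unchanged.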
CPUFreq PoC (11/14)
Changes in Xen
• Develop the CPUFreq scaling driver which
acts as an SCPI client
The driver’s main responsibility is to signal OPP
change requests directly to the SCP.
The driver uses only three SCPI ops (three commands are
sent to the SCP)
• dvfs_get_info
• dvfs_set_idx
• dvfs_get_idx
The driver also needs to maintain a cpufreq_table and take care
of matching a CPU frequency to an OPP index and vice
versa.
The driver as well as the interface component were
developed in a platform agnostic way.
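Matching frequencies to OPP indexes in both directions can be sketched with two lookups. `CpufreqTable` and its methods are hypothetical names, and the rounding policy in `resolve` (lowest OPP that is not below the target, else the highest) is an assumption for illustration, not something stated in the slides.

```python
class CpufreqTable:
    """Hypothetical frequency <-> OPP index table."""

    def __init__(self, opps_khz):
        # Position i is the OPP index as reported by the firmware.
        self.index_to_freq = list(opps_khz)
        self.freq_to_index = {freq: i for i, freq in enumerate(opps_khz)}

    def freq_for(self, index):
        return self.index_to_freq[index]

    def index_for(self, freq_khz):
        return self.freq_to_index[freq_khz]

    def resolve(self, target_khz):
        """Map an arbitrary target frequency to an OPP index (assumed policy)."""
        candidates = [f for f in self.index_to_freq if f >= target_khz]
        chosen = min(candidates) if candidates else max(self.index_to_freq)
        return self.freq_to_index[chosen]


table = CpufreqTable([500000, 1000000, 1500000, 1600000, 1700000])
idx = table.resolve(1200000)  # a governor asks for 1.2 GHz
print(idx, table.freq_for(idx))
```

The index returned by `resolve` is what would be passed with dvfs_set_idx, while `freq_for` translates the value read back with dvfs_get_idx into a frequency.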
CPUFreq PoC (12/14)
Changes in ARM TF (final)
• Add the ability to physically set the CPU OPP
That’s all.
CPUFreq PoC (13/14)
The proposed “Xen+SCP” solution (SCPI based
CPUFreq) is not limited to using ARM TF
as the provider of SCPI services.
If the ARM platform you are working with has a
“dedicated IP core” already providing SCPI services, then
even better: the only thing you need to write is an
appropriate mailbox driver for the real mailbox HW, to be
able to communicate with the SCP on your platform. This
mailbox driver is the only platform dependent
component.
So, focus on your mailbox driver only.
Important note
The Xen toolstack wasn’t modified. In order to minimize
changes in common code and retain the current ABI,
the SCPI based CPUFreq was made fully
compatible with ACPI specific P-states.
So the “xenpm” tool works out of the box on ARM.
CPUFreq PoC (14/14)
Status
This PoC works and is stable enough: there are no crashes, freezes or other weird things. It is possible to control CPUFreq parameters via the “xenpm”
tool; this feature isn’t broken. I sent an RFC patch series which initially implements the “Xen+SCP” solution to the Xen ML in autumn 2017.
The patch series consists of about 30 patches. It looks quite huge at first glance, as it pulls in a lot of verbatim code from Linux, but I believe
that the number of lines of code it adds can be significantly reduced.
Example of Xen boot log
(XEN) scpi: /mailbox@0: ARM SMC mailbox enabled with 2 chans.
(XEN) scpi: /scpi: SCP Protocol 1.2 Firmware 1.0.0 version
(XEN) cpu0: is DVFS capable, belongs to pd0
(XEN) cpu0: Turbo freq detected: 1700000
(XEN) cpu0: Turbo Mode detected and enabled
(XEN) cpu0: Turbo freq detected: 1600000
(XEN) cpu0: set Px states
(XEN) cpu1: set Px states
(XEN) cpu2: set Px states
(XEN) cpu3: set Px states
(XEN) cpu0: uploaded cpufreq data
(XEN) cpu4: failed to get clock node
(XEN) cpu4: isn't DVFS capable, skip it
(XEN) cpu5: failed to get clock node
(XEN) cpu5: isn't DVFS capable, skip it
(XEN) cpu6: failed to get clock node
(XEN) cpu6: isn't DVFS capable, skip it
(XEN) cpu7: failed to get clock node
(XEN) cpu7: isn't DVFS capable, skip it
(XEN) initialized SCPI based CPUFreq
Example of xenpm output
root@generic-armv8-xt-dom0:~# xenpm get-cpufreq-para 0
cpu id               : 0
affected_cpus        : 0 1 2 3
cpuinfo frequency    : max [1700000] min [500000] cur [500000]
scaling_driver       : scpi-cpufreq
scaling_avail_gov    : userspace performance powersave ondemand
current_governor     : ondemand
  ondemand specific  :
    sampling_rate    : max [150000000] min [150000] cur [300000]
    up_threshold     : 80
scaling_avail_freq   : 1700000 1600000 1500000 1000000 *500000
scaling frequency    : max [1700000] min [500000] cur [500000]
turbo mode           : enabled
Benchmarking results
Setup used for benchmarking
Renesas Salvator-X board with R-Car Gen3 H3 ES2.0 SoC
• Four Cortex-A57 cores (DVFS capable)
• Four Cortex-A53 cores
• Cortex-R7 core
• IMG PowerVR Series6XT GPU
• 4 GB RAM
A “demo system” powered by the Xen hypervisor was shown
at CES 2018
• Xen hypervisor + CPUFreq patches
• Thin domain 0 (generic ARMv8 Linux)
• Cluster domain D (AGL + R-Car BSP)
• Cloud services domain F (generic ARMv8 Linux)
• IVI domain A (R-Car Android Auto)
Important note
Different frequencies for individual CPU cores are not allowed.
There is only one frequency for all CPU cores inside a cluster.
Demo system
[Diagram: software stack of the demo system] Components shown:
• Cloud services domain (generic ARMv8 Linux)
• IVI domain (R-Car Android Auto)
• Cluster domain (AGL + R-Car BSP)
• Thin domain 0 (generic ARMv8 Linux)
• Xen + CPUFreq patches
• ARM Trusted Firmware
• OP-TEE
Pictures from:
http://docs.automotivelinux.org/docs/getting_started/en/dev/reference/homescreen/
https://www.androidcentral.com/android-auto/
What to measure?
Power consumption measurement (A57 cluster only)
1. With DVFS
CPUFreq settings: ondemand governor,
turbo mode enabled, CPU OPPs:
• 500 MHz (low)
• 1000 MHz (low)
• 1500 MHz (nom)
• 1600 MHz (high)
• 1700 MHz (high)
2. Without DVFS
CPUFreq settings: userspace governor, single CPU OPP:
1500 MHz (nom). This is equivalent to what we
actually have on a bare system.
Use cases (most are typical for Android)
• Android home screen with a static picture on the display
• Audio playback in Android using the built-in Media player
• Audio playback in Linux using the AGL player
• “3D Cubes test” provided by the Kitchen Sink demo app
• Video playback using YouTube (SW decoding)
• Navigation using Google Maps
How to measure?
Measurement tool
Fluke 87 Multimeter
(with AVG option)
Measurement steps
• Measure the voltage between the two terminals
of the shunt resistor
• Calculate the current
• Measure the voltage between the first terminal
of the shunt resistor and ground
• Calculate the power
Measurement scheme
[Figure: measurement scheme]
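The two calculation steps follow from Ohm's law and P = V * I. A minimal sketch, in which the function name and all numeric values (shunt resistance, measured voltages) are made-up examples, since the slides don't give them:

```python
def rail_power_w(v_shunt_v, r_shunt_ohm, v_rail_v):
    """Power drawn through a shunt resistor.

    v_shunt_v:   voltage measured across the shunt resistor
    r_shunt_ohm: resistance of the shunt resistor
    v_rail_v:    voltage between the shunt's first terminal and ground
    """
    current_a = v_shunt_v / r_shunt_ohm  # I = V_shunt / R_shunt
    return v_rail_v * current_a          # P = V_rail * I


# Hypothetical readings: 10 mV across a 5 mOhm shunt on a 0.9 V rail.
power = rail_power_w(v_shunt_v=0.010, r_shunt_ohm=0.005, v_rail_v=0.9)
print(f"{power:.2f} W")
```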
Power consumption (A57 cluster)
USE CASE                                  (P_DVFS / P_NOM - 1) * 100%   REMARKS
Home screen (background tasks)            -10.8                         Low OPPs
Audio playback (built-in Media player)    -12.2                         Low OPPs
Audio playback (AGL player)               -13.6                         Low OPPs
3D Cubes test (Kitchen Sink)              -11.9                         Low OPPs
SW video playback (YouTube)               -9.8                          Low OPPs
Navigation (Google Maps)                  +27.3                         High (turbo) OPPs
Interesting fact: audio playback consumes minimal CPU resources
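The middle column is the relative change of power consumption with DVFS against the fixed nominal OPP. A small sketch of that formula; the absolute wattages are hypothetical, chosen only so that the result reproduces the -10.8 % of the home screen row (the slides give percentages, not watts).

```python
def relative_change_percent(p_dvfs_w, p_nom_w):
    """(P_DVFS / P_NOM - 1) * 100%: negative means DVFS saves power."""
    return (p_dvfs_w / p_nom_w - 1.0) * 100.0


# Hypothetical absolute values, for illustration only.
change = relative_change_percent(p_dvfs_w=1.784, p_nom_w=2.0)
print(f"{change:+.1f} %")
```

A positive value, as in the navigation row, means the cluster drew more power with DVFS because the governor selected the turbo OPPs.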
Boot time
Low boot time is quite important in many areas...
Using the turbo OPP (1.7 GHz) during boot reduces the average boot time by 8%
compared to the nominal OPP (1.5 GHz).
Expect a bigger gain on the M3 SoC, which has a higher turbo OPP (1.8 GHz).
Improvement (guest input)
It was proposed by ARM engineers on the Xen ML.
It is supposed to affect the whole CPUFreq subsystem in Xen.
Currently the decision about a frequency change is made by
Xen exclusively.
Xen, which schedules vCPUs, doesn’t know much about guest
internals
• All Xen sees is: traps on MMIO, hypercalls, WFI
• A Linux guest can track the actual utilization of a vCPU by
keeping statistics of runnable processes and monitoring
their time slice usage. But Xen doesn’t see this
information
Therefore Xen needs additional input from guests to decide
on the proper frequency a pCPU should run at.
The idea is that guests could provide some input by signalling
OPP change requests up to Xen. Xen could then
decide whether to act on them or not.
Combined approach of CPUFreq in Xen
1. Tell Xen which PM strategies to use for certain guests
(via tools in Domain 0)
Different guests should be treated differently
• For RT guests: constant frequency
• For entertainment: varying frequency, based on guest input
This may involve CPU pinning for certain classes of guests.
2. Allow some guests (according to policy) to signal OPP
change requests to Xen
Xen takes those into account, though it may decide not to
act on them immediately.
3. Have some way of actually realising certain OPPs
Via an SCPI/SCMI client in Xen or in another way.
Two possible options with guest input
The first option has clear and straightforward logic, and its implementation would be simpler.
1. Xen doesn’t have CPUFreq logic at all
• It doesn’t measure pCPU utilization
• It collects OPP change requests from all allowed guests
• It makes a decision based on these requests and some
policies
− Tell Xen (via tools in Domain 0) about static vCPU
frequency settings which guest OPP change requests
may or may not override
• It sends the final OPP change request to the SCP
2. Xen has CPUFreq logic
• It measures pCPU utilization
• In addition it can collect OPP change requests from all
allowed guests
• It makes a decision based on both its own point of view
and the received guest requests
− Is a new governor needed for handling this?
We may reuse the idea of x86’s APERF/MPERF: guest
OPP change requests may be considered
as SW performance counters
• It sends the final OPP change request to the SCP
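A possible shape of the decision step in the first option is sketched below. Everything here is an assumption for illustration: the function name, the rule that static settings from Domain 0 always win over guest requests, and the choice to satisfy the most demanding guest because a cluster has a single frequency.

```python
def decide_cluster_opp(requests_khz, static_khz, available_khz):
    """Pick one frequency for a cluster from guest requests and policy.

    requests_khz:  {guest name: frequency requested by that guest}
    static_khz:    {guest name: fixed frequency set via tools in Domain 0}
    available_khz: frequencies of the OPPs supported by the cluster
    """
    effective = dict(requests_khz)
    effective.update(static_khz)  # assumed policy: static settings win
    if not effective:
        return min(available_khz)  # nobody asked for anything
    wanted = max(effective.values())  # one frequency must suit all guests
    candidates = [f for f in available_khz if f >= wanted]
    return min(candidates) if candidates else max(available_khz)


opps = [500000, 1000000, 1500000, 1600000, 1700000]
final = decide_cluster_opp(
    requests_khz={"ivi": 1600000, "cluster": 500000},
    static_khz={"ivi": 1000000},  # the IVI guest is capped by policy
    available_khz=opps,
)
print(final)
```

The value returned would then be translated to an OPP index and sent to the SCP as the final OPP change request.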
How will a guest signal an OPP change request to Xen?
Andre Przywara’s “SMC triggered mailbox” solution
Use the SMC mailbox to provide virtual SCPI services to
guests in a generic, non-SoC specific way.
(The SMC mailbox binding allows using “HVC” calls to trigger
services, so Xen could pick guest DVFS requests up and
act upon them.)
How it is supposed to work
• Xen creates virtual SCPI DT nodes for the guest, which uses the
SMC mailbox with method “HVC”
• The Xen “HVC” handler then redirects guest DVFS
requests to the CPUFreq code
Goals
• No extra PV protocol
• Platform agnostic for guests, while making
all guest requests end up in Xen
• Simple and clear flow
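The redirection of guest DVFS requests to the CPUFreq code can be sketched as a small dispatcher. This is not Xen code: `CpufreqCore`, `handle_guest_hvc` and the 0 / -1 status values are hypothetical; only the three command names come from these slides.

```python
class CpufreqCore:
    """Hypothetical stand-in for the CPUFreq code in Xen."""

    def __init__(self, opps_khz):
        self.opps_khz = list(opps_khz)
        self.current_idx = 0
        self.guest_requests = {}  # domid -> requested OPP index

    def note_guest_request(self, domid, idx):
        if not isinstance(idx, int) or not 0 <= idx < len(self.opps_khz):
            return False
        # Recorded as a hint only; Xen decides later whether to act on it.
        self.guest_requests[domid] = idx
        return True


def handle_guest_hvc(core, domid, command, payload=None):
    """Dispatch one virtual SCPI DVFS request trapped from a guest."""
    if command == "dvfs_get_info":
        return list(core.opps_khz)
    if command == "dvfs_get_idx":
        return core.current_idx
    if command == "dvfs_set_idx":
        return 0 if core.note_guest_request(domid, payload) else -1
    return -1  # every other command is rejected in this sketch


core = CpufreqCore([500000, 1000000, 1500000])
status = handle_guest_hvc(core, domid=1, command="dvfs_set_idx", payload=2)
```

Note that the guest's request does not change `current_idx` by itself, which mirrors the idea that Xen may decide to act on guest input or not.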
Useful links
My email:
Oleksandr_Tyshchenko@epam.com
My patch series for Xen:
https://github.com/otyshchenko1/xen.git
branch: cpufreq-devel-next
My patch series for ARM TF:
https://github.com/otyshchenko1/arm-trusted-firmware-1.git
branch: scp-devel-next
Thank you!

More Related Content

PDF
XPDDS17: Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
The Linux Foundation
 
PPTX
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
Gopi Krishnamurthy
 
PPT
Linux memory
ericrain911
 
ODP
Memory management in Linux
Raghu Udiyar
 
PPTX
Linux Memory Management with CMA (Contiguous Memory Allocator)
Pankaj Suryawanshi
 
PDF
HKG15-107: ACPI Power Management on ARM64 Servers (v2)
Linaro
 
PDF
Making Linux do Hard Real-time
National Cheng Kung University
 
PDF
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
Linaro
 
XPDDS17: Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
The Linux Foundation
 
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
Gopi Krishnamurthy
 
Linux memory
ericrain911
 
Memory management in Linux
Raghu Udiyar
 
Linux Memory Management with CMA (Contiguous Memory Allocator)
Pankaj Suryawanshi
 
HKG15-107: ACPI Power Management on ARM64 Servers (v2)
Linaro
 
Making Linux do Hard Real-time
National Cheng Kung University
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
Linaro
 



XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems

  • 1. Approach for CPUFreq in Xen on ARM Oleksandr Tyshchenko, Lead Software Engineer EPAM Systems Inc.
  • 2. Introduction
    We are a team of developers at EPAM Systems Inc., based in Kyiv, Ukraine. We are focused on:
      • Xen on ARM
      • Renesas R-Car Gen3 platform support and maintenance in Xen
      • Real-Time use-cases
      • Automotive use-cases
      • Para-virtual drivers and backends: sound, display, GPU
      • SoC’s HW virtualization
      • TEE integration
      • Power management
      • Android HALs
      • FuSa ISO 61508/26262 certification
      • We are members of GENIVI and AGL and pushing for Xen usage there
      • Yocto based build system for multi-domain distributions
    We are upstreaming to Xen Project, Linux Kernel and many OSS projects: see us at https://github.com/xen-troops
  • 3. Agenda
      1 What is it about?
      2 Current status
      3 Possible approaches
      4 CPUFreq PoC
      5 Benchmarking results
      6 Improvement (guest input)
  • 4. What benefit is expected from CPUFreq?
    Xen is becoming more popular in the embedded world these days, and the demand for CPUFreq comes from this world. There are two orthogonal key things CPUFreq can help with:
      • Performance: use of higher frequencies where required
          − Get better performance on “heavy” use cases
          − Speed up the boot process
      • Power savings: use of lower frequencies where possible
          − Save power on “light” use cases (improve the autonomy of battery powered devices)
          − Prevent possible CPU overheating, which may reduce its life
  • 5. Why does Xen need to be involved?
    The CPUFreq feature works quite well in Linux these days (a lot of architectures are supported, including ARM). So why not just pass the CPUFreq related stuff through to the hardware domain and let things work as they do in bare Linux?
    CPU virtualization is done by Xen. An abstract guest doesn’t know anything about pCPUs because it runs on vCPUs. It is not aware of:
      • How many pCPUs the entire system has
      • What the CPU topology is
      • What the actual CPU load is
      • How the vCPUs it owns correspond to pCPUs
    Obviously we can’t create a working solution without modifying Xen, as it is the only system component which has the required information.
  • 6. Current status
    Xen on x86 already has CPUFreq support out of the box. But ARM support is missing so far.
    There are two possible modes:
      1. Domain 0 based CPUFreq
      2. Hypervisor based CPUFreq
    The “Hypervisor based CPUFreq” is more interesting and more appropriate for ARM based embedded platforms, because the “Domain 0 based CPUFreq” implies:
      • Keeping the number of vCPUs in Domain 0 equal to the number of pCPUs and doing vCPU pinning (which is not always welcome)
      • Having the CPUFreq infrastructure, including HW drivers, in Domain 0 (may conflict with “Thin Domain 0” or “non-Linux Domain 0” requirements)
  • 7. Hypervisor based CPUFreq
    It has all required components:
      • Core
      • Governor
      • Scaling driver
    It supports two scaling drivers for x86:
      • ACPI Processor P-States Driver
      • AMD Architectural P-state
    It requires Domain 0 to be involved, not to pass requests to HW, but just to:
      • Parse the ACPI table and upload the information to Xen
      • Control CPUFreq parameters at run-time and collect statistics
    It is ACPI specific...
    Picture from https://wiki.xenproject.org/wiki/Xen_power_management
  • 8. ARM support is missing so far
    There was an attempt by Oleksandr Dmytryshyn to create support for ARM, which included work on making the ACPI specific “Hypervisor based CPUFreq” more generic. He proposed a split model, where a frontend driver in Xen interacts with a backend driver in the Linux hardware domain in order to scale pCPUs.
    Linux patches (RFC V5) https://lists.xen.org/archives/html/xen-devel/2014-11/msg00944.html
    Xen patches (RFC V5) https://lists.xen.org/archives/html/xen-devel/2014-11/msg00940.html
    * The pros and cons of this approach will be considered in the “Possible approaches” section.
  • 9. Possible approaches of CPUFreq on ARM
    The main problem is the “frequency changing interface” in a virtualized system. The CPUFreq core just makes a decision and issues a platform dependent call which contains the frequency for the CPU to run at. Who should be in charge of physically setting the new frequency?
      • Xen
      • Hardware domain
      • Dedicated IP core (PM coprocessor)
    The list of required components which are usually involved in Dynamic Voltage and Frequency Scaling (DVFS) is quite big. It may also vary across different ARM platforms. For “Renesas R-Car Gen3” platforms they are:
      • Generic clock, regulator, thermal frameworks
      • Platform clock, PMIC, AVS, THS drivers
      • I2C support, etc.
      • Definitely I missed something
    The possible approaches we are about to consider differ exactly in the frequency changing interface.
  • 10. “Xen + hwdom” solution
    Hardware domain scales pCPUs. Oleksandr Dmytryshyn proposed a split model, where the “hwdom-cpufreq” frontend driver in Xen interacts with the “xen-cpufreq” backend driver in the Linux hardware domain in order to scale pCPUs. But this solution hasn’t been accepted by the Xen community yet. This enabling work is frozen (there has been no activity on this topic since 2014).
    PROS
      • The beauty of this approach is that we don’t need to port a lot of DVFS specific things to Xen. A CPUFreq scaling driver in Xen which doesn’t pass requests to HW but asks hwdom to do so is a generic solution which can easily be reused by many ARM platforms.
    CONS
      • The way the “xen-cpufreq” backend driver and glue layer were implemented won’t be acceptable to the Linux community. A redesign is definitely needed.
      • Complex communication interface between Xen and hwdom: event channel, hypercalls, syscalls, custom DT bindings in order to provide pCPU info to hwdom, etc. Major questions from the Xen community regarding synchronization are still unanswered.
      • It isn’t a completely safe solution. Letting a guest manage device PM is one thing, but letting it manage one or more pCPUs is a big risk. A guest operating the HW may crash, even the hardware domain. A malicious domain may power down the whole system!
  • 11. “All-in-Xen” solution
    Xen scales pCPUs. This implies that all DVFS specific things should be located in Xen. Obviously it is not supposed to be a copy of the huge Linux frameworks; in Xen it is supposed to be much simpler than in Linux. But it is quite clear that we will have to port all the support required by the CPUFreq scaling drivers to Xen to be able to actually realise certain OPPs.
    * OPP (operating performance point) is a tuple of frequency and minimum voltage.
    PROS
      • Although non-generic, it is a safe and architecturally cleaner solution than the “Xen+hwdom” one. Having everything in Xen, we depend neither on a potentially malicious domain’s behavior nor on buggy drivers running in a domain.
    CONS
      • We are going to end up with a lot of code in Xen, since we will have to keep as many CPUFreq scaling driver implementations as there are ARM platforms we want to support in Xen.
      • An enormous development effort is expected to get this support in (the scope of required components looks huge) and to maintain it. What is more, it may not even be feasible to implement this on some ARM platforms (separate CPU clock, synchronization issues).
  • 12. “Xen+SCP” solution (1/4)
    This is yet another approach, based on the generic and currently popular ARM SCMI (SCPI) protocol.
    What is the SCMI protocol? The System Control and Management Interface (SCMI) protocol follows the recent industry trend of providing embedded microcontrollers in systems to abstract various power and other system management tasks. The protocol is supposed to be used between this embedded microcontroller, which is the System Control Processor (SCP) in terms of the protocol, and the Application Processors (AP). The mailbox feature provides a mechanism for inter-processor communication (IPC).
    Typically the SCP provides a lot of services, and one of these services is performance management, which is the ability to control the performance of a domain composed of compute engines such as APs and other accelerators. This does include DVFS for a core/cluster, which is what we actually need for CPUFreq.
    The specification is officially published; the protocol itself and SCMI based drivers are already upstreamed to Linux.
    ARM SCMI protocol http://infocenter.arm.com/help/topic/com.arm.doc.den0056a/DEN0056A_System_Control_and_Management_Interface.pdf
    ARM SCPI protocol http://infocenter.arm.com/help/topic/com.arm.doc.dui0922g/scp_message_interface_v1_2_DUI0922G_en.pdf
  • 13. “Xen+SCP” solution (2/4)
    SCP scales pCPUs. The generic idea of this approach is that there is firmware which, acting as a server, runs on a “dedicated IP core” and provides SCPI services. On the other side there is a CPUFreq scaling driver in Xen which, acting as a client, consumes these services (DVFS). The Xen driver neither changes frequency/voltage by itself nor cooperates with the Linux hwdom in order to do such a job. It just signals an OPP change request to the SCP directly using the SCMI protocol. The Xen driver also needs to query all OPPs supported by the platform when it starts.
    Requirements
      • A mailbox IP integrated into the SoC is required for IPC (a simple doorbell for triggering an action and a shared memory region for commands)
      • This approach implies that the corresponding firmware which acts as the SCP is already present
    The possible issue here is the presence of that “dedicated IP core”:
      • It may be absent on the target platform altogether
      • It may perform tasks other than PM
    But is the lack of a “dedicated IP core” a blocker?
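    The client/server split described on this slide can be modelled in a few lines. The following is only an illustrative Python sketch, not Xen or firmware code: the class names, the command numbers and the OPP values are all hypothetical, and the "doorbell" is reduced to a direct method call. It shows the two things the slide asks of the Xen driver: query the OPP list once at start, then only ever send OPP change requests.

```python
from dataclasses import dataclass

# Hypothetical command identifiers; real SCPI/SCMI command IDs differ.
CMD_GET_OPPS = 1
CMD_SET_OPP = 2


@dataclass
class SharedMemory:
    """Stands in for the shared memory region used to exchange commands."""
    command: int = 0
    payload: tuple = ()
    response: tuple = ()


@dataclass
class ScpFirmware:
    """Server side: the only component that would touch clocks/regulators."""
    opps_khz: tuple = (500000, 1000000, 1500000)  # made-up OPP list
    current_index: int = 0

    def on_doorbell(self, shmem: SharedMemory) -> None:
        if shmem.command == CMD_GET_OPPS:
            shmem.response = self.opps_khz
        elif shmem.command == CMD_SET_OPP:
            (index,) = shmem.payload
            if 0 <= index < len(self.opps_khz):
                self.current_index = index
                shmem.response = (0,)   # success
            else:
                shmem.response = (-1,)  # invalid OPP index


class XenScalingDriver:
    """Client side: never programs hardware, only sends requests."""

    def __init__(self, scp: ScpFirmware, shmem: SharedMemory):
        self.scp = scp
        self.shmem = shmem
        # Query all OPPs supported by the platform when the driver starts.
        self.opps_khz = self._call(CMD_GET_OPPS)

    def _call(self, command, payload=()):
        self.shmem.command = command
        self.shmem.payload = payload
        self.scp.on_doorbell(self.shmem)  # "ring the doorbell"
        return self.shmem.response

    def set_opp(self, index: int) -> bool:
        (status,) = self._call(CMD_SET_OPP, (index,))
        return status == 0
```

    In this model the driver holds no clock or regulator state at all; everything platform specific sits behind `on_doorbell`, which is the property the approach relies on.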
  • 14. “Xen+SCP” solution (3/4)
    No, the lack of a “dedicated IP core” isn’t a blocker. Andre Przywara’s “SMC triggered mailbox” approach, with his PoC code for Allwinner, demonstrates that. The idea is to teach the firmware which runs on the very same AP cores as Xen, but in the EL3 exception level, to perform SCP functions and to use Secure Monitor Call (SMC) calls for communication. Such a solution is a good compromise for all ARM platforms that do have firmware running in the EL3 exception level (for example ARM TF) and don’t have a candidate for being the SCP.
    Even a dedicated mailbox IP is not needed for this purpose. The “SMC triggered mailbox” driver emulates a mailbox which signals transmitted data via the SMC instruction. The mailbox receiver is implemented in firmware and synchronously returns data when it returns execution to the non-secure world again. This allows us both to trigger a request and to transfer execution to the firmware code in a safe and architected way (like PSCI requests).
    SMC triggered mailbox http://linux-sunxi.narkive.com/qYWJqjXU/patch-v2-0-3-mailbox-arm-introduce-smc-triggered-mailbox
    SMC/HVC http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
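    The synchronous nature of an SMC triggered mailbox can be illustrated with a small Python model. This is a sketch under stated assumptions, not the actual driver: `El3Firmware`, `SmcMailbox` and the function ID are hypothetical names and values, and the SMC instruction is modelled as an ordinary function call that runs to completion before returning.

```python
class El3Firmware:
    """Hypothetical firmware handler running at EL3 on the same AP cores."""

    def __init__(self):
        self.handled = []

    def smc_handler(self, function_id, shmem):
        # Runs to completion before control returns to the caller,
        # which is what makes the mailbox fully synchronous.
        self.handled.append(function_id)
        shmem["reply"] = ("ack", shmem["request"])
        return 0


class SmcMailbox:
    """Mailbox channel whose doorbell is an SMC instead of a HW register."""

    # Hypothetical function ID, not taken from any real firmware.
    FUNCTION_ID = 0x82000001

    def __init__(self, firmware):
        self.firmware = firmware
        self.shmem = {"request": None, "reply": None}

    def send(self, message):
        self.shmem["request"] = message
        # Stand-in for executing the SMC instruction.
        status = self.firmware.smc_handler(self.FUNCTION_ID, self.shmem)
        if status != 0:
            raise RuntimeError("firmware rejected the request")
        # The reply is already in shared memory: no RX interrupt, no waiting.
        return self.shmem["reply"]
```

    Because `send` returns only after the firmware has written its reply, the caller needs neither an RX-done interrupt nor a completion to wait on.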
  • 15. “Xen+SCP” solution (4/4)
    PROS
      • It is a secure and architecturally clean solution. There is no reason not to trust firmware in an embedded microcontroller or firmware running in a trusted SW layer (like ARM TF).
      • It is a generic solution which can easily be reused by many ARM platforms. The only platform-dependent component is the mailbox driver implementing the minimum set of functions. So all we have to add to support a new platform is a simple mailbox driver.
      • The Xen side is easy to implement compared to the previous approaches. Once the common part is implemented, it is easy to add support for a new platform, provided that it already has the proper firmware.
      • No new ABI, hypercalls, syscalls, DT bindings, etc. as in the “Xen+hwdom” approach. No complex communication interface.
      • Only this approach allows guest input regarding frequency changes to be taken into account in Xen with minimal code modifications.
    CONS
      • The corresponding firmware which provides the SCPI services must be present. It can be either firmware which runs on a “dedicated IP core” or firmware which runs on the very same AP cores as Xen, but in the EL3 exception level (ARM TF).
      • It may be necessary to emulate all SCPI commands in Xen (to be an SCP for guests). The SCPI protocol in Linux may be used not only for DVFS but also for device runtime PM and clock management, so we can’t drop this ability just because we want to run a “CPUFreq enabled” Xen. This is a limitation of SCPI only, which, unlike SCMI, does not seem able to manage parallel connections to the SCP from different SW layers correctly.
  • 16. Conclusion
    What we want is a generic, secure and architecturally clean solution. I thought (and still think) that the “Xen+SCP” solution is the most appropriate for this goal among all the solutions considered. The CPUFreq PoC implements exactly this solution, although with some limitations. However, if you have yet another solution, we can consider it as well.
  • 17. CPUFreq PoC (1/14): Main points
    1. ARM SCPI
    I built this PoC on top of the SCPI protocol. But why not the SCMI protocol?
      • When I was doing the research, upstream Linux support for SCMI was missing, but SCPI support had already been upstreamed, and there were enough good examples of how to use it
      • The range of capabilities SCPI had was enough for implementing this PoC
    The situation has changed since then and we will probably move to SCMI for the final solution. Or Xen may support both protocols; this is open for discussion…
    2. “SMC triggered mailbox”
    I borrowed the idea of the “SMC triggered mailbox” driver, which emulates a mailbox that signals transmitted data via the SMC instruction to firmware running in the EL3 exception level (ARM TF). The reason was the lack of a free “dedicated IP core” for providing SCPI services on the “Renesas R-Car Gen3” platform I worked with. The idea of using ARM TF as an SCPI server and SMC calls for communication looked reasonable to me.
    3. Modified firmware (ARM TF)
    In my case it was feasible to modify ARM TF, as the official BSP release I used included both firmware and software. In the classic embedded scenario, where both firmware and software are provided by the same entity, it is going to be feasible as well. Using Andre Przywara’s PoC for Allwinner as an example, I managed to prepare something working for R-Car Gen3.
  • 18. CPUFreq PoC (2/14)
    The CPUFreq feature works out of the box if we run bare Linux from the vendor’s BSP. The BSP release for the Renesas R-Car Gen3 platform comes with CPUFreq support enabled in Linux; it uses the “dt-cpufreq” driver. The OPPs, clocks and cpu-supply properties are described in the platform DT file and this driver extracts this information.
    Example of original pCPU node
        a57_0: cpu@0 {
            compatible = "arm,cortex-a57", "arm,armv8";
            reg = <0x0>;
            device_type = "cpu";
            power-domains = <&sysc R8A7795_PD_CA57_CPU0>;
            next-level-cache = <&L2_CA57>;
            enable-method = "psci";
            cpu-idle-states = <&CPU_SLEEP_0>;
            #cooling-cells = <2>;
            dynamic-power-coefficient = <854>;
            cooling-min-level = <0>;
            cooling-max-level = <2>;
            clocks = <&cpg CPG_CORE R8A7795_CLK_Z>;
            operating-points-v2 = <&cluster0_opp_tb0>, <&cluster0_opp_tb1>,
                                  <&cluster0_opp_tb2>, <&cluster0_opp_tb3>,
                                  <&cluster0_opp_tb4>, <&cluster0_opp_tb5>,
                                  <&cluster0_opp_tb6>, <&cluster0_opp_tb7>;
            cpu-supply = <&vdd_dvfs>;
        };
  • 19. CPUFreq PoC (3/14)
    But if we run that Linux as Domain 0 (hardware domain) on top of Xen, the CPUFreq feature is broken. When creating the DT for the domain, Xen inserts only dummy CPU nodes, and the number of inserted CPU nodes is equal to the number of vCPUs assigned to this domain. None of the CPU properties in the original DT, such as OPP, clock, regulator, etc., are passed to the guest’s DT.
    Example of guest vCPU node
        cpu@0 {
            device_type = "cpu";
            compatible = "arm,armv8";
            enable-method = "psci";
            reg = <0x0>;
        };
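    The effect described on this slide can be reproduced with a toy generator. This is a hedged Python illustration of the behaviour, not Xen's actual DT construction code: the function name is hypothetical, nodes are plain dictionaries, and using the vCPU number directly as `reg` is a simplification.

```python
def make_guest_cpu_nodes(num_vcpus):
    """Build one dummy CPU node per vCPU, as plain dictionaries.

    Only the four properties from the guest example are emitted; nothing
    from the host pCPU node (OPPs, clocks, cpu-supply) is carried over.
    Using the vCPU number as `reg` is a simplification for illustration.
    """
    nodes = []
    for vcpu in range(num_vcpus):
        nodes.append({
            "name": f"cpu@{vcpu:x}",
            "device_type": "cpu",
            "compatible": "arm,armv8",
            "enable-method": "psci",
            "reg": vcpu,
        })
    return nodes
```

    With no `operating-points-v2`, `clocks` or `cpu-supply` properties in the generated nodes, a DT based CPUFreq driver in the guest has nothing to work with, which is why the feature breaks.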
  • 20. CPUFreq PoC (4/14) It started from this point...
  • 21. CPUFreq PoC (5/14)
    Changes in Linux guest
      • Disable the CPUFreq feature in the Linux defconfig
      • Remove from the DT and platform files all components “involved in CPU scaling”, such as clocks, the DVFS I2C bus, etc.
    Leaving such components available to the guest could negatively affect CPU scaling done by other SW layers (Xen, ARM TF). The Linux guest may decide that such a component is unused and can therefore be disabled...
  • 22. CPUFreq PoC (6/14)
    Changes in ARM TF
      • Modify the firmware to be able to act as an SCPI server and provide DVFS services (just an emulator at that moment, to be able to develop the Xen side).
    Normally we write drivers for the existing firmware, not vice versa...
  • 23. CPUFreq PoC (7/14)
    Changes in DT
      • Add an SCPI node with all required properties such as shmem, mboxes, and so on
      • Enable hypervisor based CPUFreq and set an initial governor from the DT command line
    Changes in Xen
      • Rebase Oleksandr Dmytryshyn’s patch series which makes the ACPI specific CPUFreq stuff more generic
      • Port a bunch of DT helpers and macros from Linux
      • Create a few misc patches for the ARM subsystem (some preparations for SCPI based CPUFreq).
    There is a special binding which defines the interface to firmware implementing SCPI. The DT parsing code in Xen needs to be compatible with the existing DTs describing the SCPI implementation.
  • 24. CPUFreq PoC (8/14)
    Changes in Xen
      • Port the whole SCPI protocol from Linux
      • Port the mailbox framework from Linux (as the protocol relies on the mailbox feature being present)
      • Modify the directly ported code (Xen is not allowed to sleep, so there will be no mutexes, completions, etc.)
    There is definitely room for optimization...
  • 25. CPUFreq PoC (9/14)
    Changes in Xen
      • Port the “SMC triggered mailbox” driver and modify it a bit.
    The “SMC triggered mailbox” is a completely “synchronous” mailbox because of the nature of SMC. But “asynchronous” mailboxes can be used as well. The one limitation is that the mailbox HW must have an RX-done interrupt. The possible candidates are:
      • ARM MHU
      • Rockchip mailbox
      • Whatever
  • 26. CPUFreq PoC (10/14)
    Changes in Xen
      • Develop the CPUFreq interface component
    The interface component performs the following steps:
      • Initialize everything needed for the CPUFreq scaling driver to be functional inside Xen (SCPI protocol, mailbox, etc.)
      • Register the future CPUFreq scaling driver
      • Populate CPUs
      • Get the DVFS info, which is the OPP list and the latency information, for all online DVFS capable CPUs using the SCPI protocol
      • Convert these capabilities into performance states and rearrange them, as performance states must start from the higher values
      • Upload the resulting PM data to the CPUFreq framework
    The hardware domain doesn’t need to be involved (unlike the ACPI parser case on x86), since we already have everything in hand here in Xen.
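    The conversion step in the list above (OPPs into performance states that start from the higher values) can be sketched as follows. This is an illustrative Python model only: the function name, the record layout and the latency value used in the example are assumptions, not the PoC's data structures.

```python
def opps_to_pstates(opps_khz, latency_us):
    """Turn an OPP frequency list into P-state records, fastest first.

    State 0 is the highest frequency, mirroring the requirement that
    performance states start from the higher values. Duplicate
    frequencies are collapsed.
    """
    ordered = sorted(set(opps_khz), reverse=True)
    return [
        {"state": index, "freq_khz": freq, "latency_us": latency_us}
        for index, freq in enumerate(ordered)
    ]
```

    For instance, an ascending firmware list such as 500000, 1000000, 1500000, 1600000, 1700000 kHz comes out with 1700000 kHz as state 0 and 500000 kHz as state 4, whatever order the firmware reported it in.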
  • 27. CPUFreq PoC (11/14)
    Changes in Xen
      • Develop the CPUFreq scaling driver which acts as an SCPI client
    The driver’s main responsibility is to signal OPP change requests directly to the SCP. The driver uses only three SCPI ops (three commands are sent to the SCP):
      • dvfs_get_info
      • dvfs_set_idx
      • dvfs_get_idx
    The driver also needs to maintain the cpufreq_table and take care of matching the CPU frequency to the OPP index and vice versa. The driver as well as the interface component were developed in a platform agnostic way.
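    The frequency/OPP-index bookkeeping mentioned on this slide amounts to a two-way lookup. Below is a minimal Python sketch of that idea; `CpufreqTable` and its methods are hypothetical names, and the table is assumed to be filled from the list that a `dvfs_get_info`-style query returns, in firmware index order.

```python
class CpufreqTable:
    """Two-way mapping between OPP index (firmware view) and frequency."""

    def __init__(self, opp_freqs_khz):
        # Position in the list is the OPP index understood by the firmware.
        self._freq_by_index = list(opp_freqs_khz)
        self._index_by_freq = {
            freq: index for index, freq in enumerate(self._freq_by_index)
        }

    def index_for(self, freq_khz):
        """Frequency chosen by the governor -> index to send to the SCP."""
        if freq_khz not in self._index_by_freq:
            raise ValueError(f"{freq_khz} kHz is not a supported OPP")
        return self._index_by_freq[freq_khz]

    def freq_for(self, index):
        """Index reported by the SCP -> current frequency in kHz."""
        if not 0 <= index < len(self._freq_by_index):
            raise ValueError(f"OPP index {index} is out of range")
        return self._freq_by_index[index]
```

    A governor decision expressed in kHz would go through `index_for` before a set-index request, and the value read back by a get-index request would go through `freq_for`.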
  • 28. CPUFreq PoC (12/14): Changes in ARM TF (final)
      • Add the ability to physically set the CPU OPP
    That’s all.
  • 29. CPUFreq PoC (13/14)
    The proposed solution “Xen + SCP” (SCPI based CPUFreq) is not limited to using ARM TF as the provider of SCPI services. If the ARM platform you are working with has a “dedicated IP core” that already provides SCPI services, even better: the only thing you need is an appropriate mailbox driver for the real mailbox HW to be able to communicate with the SCP on your platform. This mailbox driver is the only platform dependent component. So, focus on your mailbox driver only.
    Important note: the Xen toolstack wasn’t modified. In order to minimize changes in common code and retain the current ABI, the SCPI based CPUFreq was made fully compatible with ACPI specific P-states. So, the “xenpm” tool works out of the box on ARM.
  • 30. CPUFreq PoC (14/14): Status
    This PoC works and is stable enough: there are no crashes, freezes or other weird things. It is possible to control CPUFreq parameters via the “xenpm” tool; this feature isn’t broken. I sent an RFC patch series, which initially implements the “Xen+SCP” solution, to the Xen ML in autumn 2017. The patch series consists of about 30 patches. It looks quite huge at first glance, as it pulls a lot of verbatim code from Linux, but I believe that the number of lines of code it adds can be significantly reduced.
    Example of Xen boot log
      (XEN) scpi: /mailbox@0: ARM SMC mailbox enabled with 2 chans.
      (XEN) scpi: /scpi: SCP Protocol 1.2 Firmware 1.0.0 version
      (XEN) cpu0: is DVFS capable, belongs to pd0
      (XEN) cpu0: Turbo freq detected: 1700000
      (XEN) cpu0: Turbo Mode detected and enabled
      (XEN) cpu0: Turbo freq detected: 1600000
      (XEN) cpu0: set Px states
      (XEN) cpu1: set Px states
      (XEN) cpu2: set Px states
      (XEN) cpu3: set Px states
      (XEN) cpu0: uploaded cpufreq data
      (XEN) cpu4: failed to get clock node
      (XEN) cpu4: isn't DVFS capable, skip it
      (XEN) cpu5: failed to get clock node
      (XEN) cpu5: isn't DVFS capable, skip it
      (XEN) cpu6: failed to get clock node
      (XEN) cpu6: isn't DVFS capable, skip it
      (XEN) cpu7: failed to get clock node
      (XEN) cpu7: isn't DVFS capable, skip it
      (XEN) initialized SCPI based CPUFreq
    Example of xenpm output
      root@generic-armv8-xt-dom0:~# xenpm get-cpufreq-para 0
      cpu id               : 0
      affected_cpus        : 0 1 2 3
      cpuinfo frequency    : max [1700000] min [500000] cur [500000]
      scaling_driver       : scpi-cpufreq
      scaling_avail_gov    : userspace performance powersave ondemand
      current_governor     : ondemand
        ondemand specific  :
          sampling_rate    : max [150000000] min [150000] cur [300000]
          up_threshold     : 80
      scaling_avail_freq   : 1700000 1600000 1500000 1000000 *500000
      scaling frequency    : max [1700000] min [500000] cur [500000]
      turbo mode           : enabled
  • 31. Benchmarking results: Setup used for benchmarking
    Renesas Salvator-X board with R-Car Gen3 H3 ES2.0 SoC
      • Four Cortex-A57 cores (DVFS capable)
      • Four Cortex-A53 cores
      • Cortex-R7 core
      • IMG PowerVR Series6XT GPU
      • 4 GB RAM
    “Demo system” powered by Xen hypervisor was shown at CES 2018
      • Xen hypervisor + CPUFreq patches
      • Thin domain 0 (generic ARMv8 Linux)
      • Cluster domain D (AGL + R-Car BSP)
      • Cloud services domain F (generic ARMv8 Linux)
      • IVI domain A (R-Car Android Auto)
    Important note: different frequencies for individual CPU cores are not allowed. There is only one frequency for all CPU cores inside a cluster.
  • 32. Demo system (diagram)
      • Cloud services domain (generic armv8 Linux)
      • IVI domain (R-Car Android Auto)
      • Cluster domain (AGL + R-Car BSP)
      • Thin domain 0 (generic armv8 Linux)
      • Xen + CPUFreq patches
      • ARM Trusted Firmware
      • OP-TEE
    Pictures from: http://docs.automotivelinux.org/docs/getting_started/en/dev/reference/homescreen/ https://www.androidcentral.com/android-auto/
  • 33. What to measure?
    Power consumption measurement (A57’s cluster only)
    1. With DVFS. CPUFreq settings: Ondemand governor, Turbo mode enabled, CPU OPPs:
      • 500 MHz (low)
      • 1000 MHz (low)
      • 1500 MHz (nom)
      • 1600 MHz (high)
      • 1700 MHz (high)
    2. Without DVFS. CPUFreq settings: Userspace governor, single CPU OPP of 1500 MHz (nom). This is equivalent to what we actually have on a bare system.
    Use cases (most are typical for Android)
      • Android home screen with a static picture on the display
      • Audio playback in Android using the built-in Media player
      • Audio playback in Linux using the AGL player
      • “3D Cubes test” provided by the Kitchen Sink demo app
      • Video playback using YouTube (SW decoding)
      • Navigation using Google Maps
  • 34. How to measure?
    Measurement tool
      • Fluke 87 Multimeter (with AVG option)
    Measurement steps
      • Measure the voltage between the two terminals of the shunt resistor
      • Calculate the current
      • Measure the voltage between the first terminal of the shunt resistor and ground
      • Calculate the power
    Measurement scheme (figure)
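The two calculations in the measurement steps are Ohm’s law followed by P = V * I. A minimal Python sketch is given below. The function name and the numeric values are purely illustrative assumptions; the deck does not state the shunt resistance or the measured voltages.

```python
def shunt_power(v_shunt, v_to_ground, r_shunt):
    """Return (current in A, power in W) from the two voltage readings.

    v_shunt     -- voltage across the shunt resistor, in volts
    v_to_ground -- voltage between the shunt terminal and ground, in volts
    r_shunt     -- shunt resistance, in ohms
    """
    if r_shunt <= 0:
        raise ValueError("shunt resistance must be positive")
    current = v_shunt / r_shunt      # Ohm's law: I = V / R
    power = v_to_ground * current    # P = V * I
    return current, power


# Hypothetical readings, chosen only to show the arithmetic.
current, power = shunt_power(v_shunt=0.010, v_to_ground=0.9, r_shunt=0.005)
print(f"I = {current:.2f} A, P = {power:.2f} W")
```

Using the multimeter’s AVG option for both readings matters here, since the load under the Ondemand governor changes constantly.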
  • 35. Power consumption (A57’s cluster)
    USE CASE                                 (P_DVFS / P_NOM - 1) * 100%   REMARKS
    Home screen (background tasks)           -10.8                         Low OPPs
    Audio playback (built-in Media player)   -12.2                         Low OPPs
    Audio playback (AGL player)              -13.6                         Low OPPs
    3D Cubes test (Kitchen Sink)             -11.9                         Low OPPs
    SW Video playback (YouTube)              -9.8                          Low OPPs
    Navigation (Google Maps)                 +27.3                         High (turbo) OPPs
    Interesting fact: audio playback consumes minimal CPU resources.
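For reference, the metric in the middle column is the relative change in power with DVFS against the fixed nominal OPP. The Python sketch below shows how it would be computed. The function name and input values are hypothetical, since the deck reports only the resulting percentages and not the absolute power figures.

```python
def relative_change_percent(p_dvfs, p_nom):
    """(P_DVFS / P_NOM - 1) * 100: negative means DVFS consumed less power."""
    if p_nom <= 0:
        raise ValueError("nominal power must be positive")
    return (p_dvfs / p_nom - 1.0) * 100.0


# Hypothetical power readings in watts, not taken from the measurements.
saving = relative_change_percent(p_dvfs=1.8, p_nom=2.0)
increase = relative_change_percent(p_dvfs=2.5, p_nom=2.0)
print(f"{saving:+.1f}%  {increase:+.1f}%")
```

A negative result therefore reads as a power saving, and a positive one (as in the navigation case) as extra power spent on the high OPPs.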
  • 36. Boot time
    Low boot time is quite important in many areas...
    Using OPP turbo (1.7 GHz) during boot reduces the average boot time by 8% compared to OPP nom (1.5 GHz). Expect a bigger gain on the M3 SoC, which has a higher OPP turbo (1.8 GHz).
  • 37. Improvement (guest input)
    It was proposed by ARM developers on the Xen ML. It is supposed to affect the whole CPUFreq subsystem in Xen.
    Currently the decision about a frequency change is made by Xen exclusively. Xen, while scheduling vCPUs, doesn’t know much about guest internals:
      • All Xen sees is: trap on MMIO, hypercall, WFI
      • A Linux guest can track the actual utilization of a vCPU by keeping statistics of runnable processes and monitoring their time slice usage, but Xen doesn’t see this information
    Therefore Xen needs additional input from guests to decide on the proper frequency a pCPU should run at. The idea is that guests could provide some input by signalling OPP change requests up to Xen, and Xen could then decide whether to act on them or not.
  • 38. Combined approach of CPUFreq in Xen
    1. Tell Xen about the PM strategies to use for certain guests (via tools in Domain 0)
       Different guests should be treated differently:
      • For RT guests: constant frequency
      • For entertainment: varying frequency, based on guest input
       May involve CPU pinning for certain classes of guests.
    2. Allow some guests (according to policy) to signal OPP change requests to Xen
       Xen takes those into account, though it may decide not to act on them immediately.
    3. Have some way of actually realising certain OPPs
       Via an SCPI/SCMI client in Xen or in another way.
  • 39. Two possible options with guest input
    1. Xen doesn’t have CPUFreq logic at all
      • It doesn’t measure pCPU utilization
      • It collects OPP change requests from all allowed guests
      • It makes a decision based on these requests and some policies
          − Tell Xen (via tools in Domain 0) about static vCPU frequency settings which guest OPP change requests may or may not override
      • It sends the final OPP change request to the SCP
    2. Xen has CPUFreq logic
      • It measures pCPU utilization
      • In addition, it can collect OPP change requests from all allowed guests
      • It makes a decision based on both its own point of view and the received guest requests
          − Is a new governor needed for handling this? We may reuse the idea of x86’s APERF/MPERF: guest OPP change requests may be considered as SW performance counters
      • It sends the final OPP change request to the SCP
    The first option has clear and straightforward logic, and its implementation would be simpler.
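To make the first option more concrete, here is a rough Python sketch of how Xen might combine guest requests with a static policy. Everything in it is an assumption made for illustration: the function name, the parameters, and in particular the rule of taking the highest request and rounding it up to the next available OPP. The slides do not define the actual decision policy.

```python
def decide_opp(requests, allowed_guests, available_khz,
               static_khz=None, may_override=True):
    """Pick one frequency for the cluster from guest OPP change requests.

    requests       -- {guest name: requested frequency in kHz}
    allowed_guests -- guests whose requests are taken into account
    available_khz  -- frequencies of the OPPs the cluster supports
    static_khz     -- static setting from the toolstack, or None
    may_override   -- whether guest requests may override static_khz
    """
    wanted = [khz for guest, khz in requests.items() if guest in allowed_guests]
    if static_khz is not None and (not may_override or not wanted):
        return static_khz
    if not wanted:
        return min(available_khz)
    # Assumed policy: satisfy the most demanding allowed guest.
    target = max(wanted)
    for khz in sorted(available_khz):
        if khz >= target:
            return khz
    return max(available_khz)


opps_khz = [500000, 1000000, 1500000, 1600000, 1700000]
requests = {"ivi": 1200000, "cluster": 600000, "rt": 1700000}
chosen = decide_opp(requests, {"ivi", "cluster"}, opps_khz)
print(chosen)
```

Because all cores in a cluster share one frequency, some rule for merging several requests into a single value is needed; taking the maximum is just one simple choice that never starves the busiest guest.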
  • 40. How will a guest signal an OPP change request to Xen?
    Andre Przywara’s “SMC triggered mailbox” solution
    Use the SMC mailbox to provide virtual SCPI services to guests in a generic, non-SoC specific way. (The SMC mailbox binding allows using “HVC” calls to trigger services, so Xen could pick up guest DVFS requests and act upon them.)
    How it is supposed to work
      • Xen creates virtual SCPI DT nodes for the guest and uses the SMC mailbox with method “HVC”
      • The Xen “HVC” handler then redirects guest DVFS requests to the CPUFreq code
    Goals
      • No extra PV protocol
      • Platform agnostic for guests, while making all guest requests end up in Xen
      • Simple and clear flow
  • 41. Useful links
      • My email: [email protected]
      • My patch series for Xen: https://github.com/otyshchenko1/xen.git branch: cpufreq-devel-next
      • My patch series for ARM TF: https://github.com/otyshchenko1/arm-trusted-firmware-1.git branch: scp-devel-next