Denis Bakhvalov
11 Nov 2024 »
Book Updates and Errata. Performance Analysis and Tuning on Modern CPUs (Second Edition)
10 May 2024 »
Thread Count Scaling Part 5. Summary
10 May 2024 »
Thread Count Scaling Part 4. CloverLeaf and CPython
10 May 2024 »
Thread Count Scaling Part 3. Zstandard
10 May 2024 »
Thread Count Scaling Part 2. Blender and Clang
10 May 2024 »
Thread Count Scaling Part 1. Introduction
12 Feb 2024 »
Memory Profiling Part 5. Data Locality and Reuse Distances
12 Feb 2024 »
Memory Profiling Part 4. Memory Footprint Case Study
12 Feb 2024 »
Memory Profiling Part 3. Memory Footprint with SDE
12 Feb 2024 »
Memory Profiling Part 2. Memory Usage Case Study
12 Feb 2024 »
Memory Profiling Part 1. Introduction
17 Oct 2022 »
Four Cornerstones of CPU Performance.
01 Sep 2022 »
Performance Benefits of Using Huge Pages for Code.
11 May 2022 »
Visualizing Performance-Critical Dependency Chains.
17 Dec 2021 »
Highlights from Twitter Spaces discussions 2021.
24 Jan 2021 »
Machine Programming. What if computers would program themselves?
30 Dec 2020 »
Computing industry at the end of 2020 as I see it.
29 Nov 2020 »
Reflections on Writing a Book. Part 2.
28 Nov 2020 »
Reflections on Writing a Book. Part 1.
22 Nov 2020 »
Writing a Free Book From the Start.
24 Jun 2020 »
Draft of my perf book is ready!
01 Apr 2020 »
HW and SW rules of thumb.
26 Feb 2020 »
Guest post: COZ vs Sampling Profilers.
30 Dec 2019 »
Benchmarking: compare measurements and check which is faster.
17 Dec 2019 »
Detect false sharing with Data Address Profiling.
13 Dec 2019 »
How sport helps me to recharge.
27 Nov 2019 »
Data-Driven tuning. Specialize indirect call.
22 Nov 2019 »
Data-Driven tuning. Specialize switch with one hot case.
12 Oct 2019 »
How to find expensive locks in multithreaded application.
05 Oct 2019 »
Performance analysis of multithreaded applications.
13 Sep 2019 »
Intel Processor Trace Part4. Better profiling experience.
06 Sep 2019 »
Intel Processor Trace Part3. Analyzing performance glitches.
30 Aug 2019 »
Intel Processor Trace Part2. Better debugging experience.
23 Aug 2019 »
Enhance performance analysis with Intel Processor Trace.
02 Aug 2019 »
How to get consistent results when benchmarking on Linux?
26 Jul 2019 »
Developing intuition when working with performance counters.
06 May 2019 »
Estimating branch probability using Intel LBR feature.
03 Apr 2019 »
Precise timing of machine code with Linux perf.
27 Mar 2019 »
Machine code layout optimizations.
23 Feb 2019 »
How to collect CPU performance counters on Windows?
09 Feb 2019 »
Top-Down performance analysis methodology.
29 Dec 2018 »
Understanding IDQ_UOPS_NOT_DELIVERED performance counter.
08 Nov 2018 »
Using denormal values is slow. How to detect it?
12 Oct 2018 »
I am relocating to the US.
04 Sep 2018 »
Performance analysis vocabulary.
29 Aug 2018 »
Understanding performance events skid.
26 Aug 2018 »
Basics of profiling with perf.
09 Jul 2018 »
Improving performance by better code locality.
08 Jun 2018 »
Advanced profiling topics. PEBS and LBR.
01 Jun 2018 »
PMU counters and profiling basics.
03 May 2018 »
My learning resources in 2018
22 Apr 2018 »
What optimizations you can expect from CPU?
03 Apr 2018 »
Tools for microarchitectural benchmarking.
21 Mar 2018 »
Understanding CPU port contention.
12 Mar 2018 »
Embo 2018 trip report.
09 Mar 2018 »
Store forwarding by example.
23 Feb 2018 »
MacroFusion in Intel CPUs.
15 Feb 2018 »
MicroFusion in Intel CPUs.
04 Feb 2018 »
Microbenchmarking fused instruction.
25 Jan 2018 »
Code alignment options in llvm.
18 Jan 2018 »
Code alignment issues.
21 Nov 2017 »
Code::Dive 2017 trip report.
10 Nov 2017 »
Vectorization part7. Tips for writing vectorizable code.
09 Nov 2017 »
Vectorization part6. Multiversioning by trip counts.
03 Nov 2017 »
Vectorization part5. Multiversioning by data dependency.
02 Nov 2017 »
Vectorization part4. Vectorization Width.
30 Oct 2017 »
Vectorization part3. Compiler report.
27 Oct 2017 »
Vectorization part2. Warmup.
24 Oct 2017 »
Vectorization part1. Intro.
25 Nov 2016 »
Small size optimization.
21 Nov 2016 »
Sentinels.
20 Nov 2016 »
Code::Dive 2016 trip report.
If you like this blog, support me on Patreon, Github, or by PayPal donation.
All content on Easyperf blog is licensed under a Creative Commons Attribution 4.0 International License.