| Mesa 20.0.0 Release Notes / 2020-02-19 |
| ====================================== |
| |
| Mesa 20.0.0 is a new development release. People who are concerned with |
| stability and reliability should stick with a previous release or wait |
| for Mesa 20.0.1. |
| |
| Mesa 20.0.0 implements the OpenGL 4.6 API, but the version reported by |
| glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / |
| glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being |
| used. Some drivers don't support all the features required in OpenGL |
| 4.6. OpenGL 4.6 is **only** available if requested at context creation. |
| Compatibility contexts may report a lower version depending on each |
| driver. |
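| |
| Because the reported version depends on the driver, an application |
| typically checks it at run time before relying on OpenGL 4.6 features. |
| The Python sketch below parses a string of the kind returned by |
| glGetString(GL_VERSION). The helper names parse_gl_version and |
| supports_gl and the sample string are illustrative assumptions, not |
| part of Mesa. The parser assumes the desktop OpenGL layout |
| "major.minor[.release] vendor-info". |
| |

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a GL_VERSION-style string.

    Hypothetical helper: assumes the string starts with
    "<major>.<minor>", optionally followed by ".<release>" and
    vendor-specific text, as desktop OpenGL drivers report it.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports_gl(version_string, major, minor):
    """Return True if the reported version is at least major.minor."""
    return parse_gl_version(version_string) >= (major, minor)


# Illustrative string only; the real value depends on driver and context.
sample = "4.6 (Core Profile) Mesa 20.0.0"
print(parse_gl_version(sample))
print(supports_gl(sample, 4, 6))
```

| |
| With a live context, querying GL_MAJOR_VERSION and GL_MINOR_VERSION |
| through glGetIntegerv avoids string parsing altogether. |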
| |
| Mesa 20.0.0 implements the Vulkan 1.2 API, but the version reported by |
| the apiVersion property of the VkPhysicalDeviceProperties struct depends |
| on the particular driver being used. |
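| |
| The apiVersion property is a single packed 32-bit integer. As a rough |
| sketch, assuming the bit layout used by the VK_MAKE_VERSION and |
| VK_VERSION_MAJOR / VK_VERSION_MINOR / VK_VERSION_PATCH macros in |
| Vulkan 1.2-era headers (10 bits major, 10 bits minor, 12 bits patch), |
| it can be decoded as follows. The function names are hypothetical and |
| the value is constructed locally rather than read from a device. |
| |

```python
def make_vk_version(major, minor, patch):
    """Pack a version the way VK_MAKE_VERSION does (assumed layout)."""
    return (major << 22) | (minor << 12) | patch


def decode_vk_version(api_version):
    """Split a packed Vulkan version into (major, minor, patch)."""
    major = api_version >> 22
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return major, minor, patch


# Constructed example; a real value would come from
# VkPhysicalDeviceProperties.apiVersion for the device in use.
packed = make_vk_version(1, 2, 0)
print(decode_vk_version(packed))
```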
| |
| SHA256 checksum |
| --------------- |
| |
| :: |
| |
| bb6db3e54b608d2536d4000b3de7dd3ae115fc114e8acbb5afff4b3bbed04b34 mesa-20.0.0.tar.xz |
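| |
| One way to check a downloaded tarball against the checksum above is |
| sketched below using Python's standard hashlib module. The function |
| names and the local file path are assumptions made for illustration; |
| the expected digest is the one listed in these notes. |
| |

```python
import hashlib

# Digest copied from the "SHA256 checksum" section of these notes.
EXPECTED_SHA256 = "bb6db3e54b608d2536d4000b3de7dd3ae115fc114e8acbb5afff4b3bbed04b34"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest matches the expected value."""
    return sha256_of_file(path) == expected.lower()


# Example call, assuming the tarball sits in the current directory:
# verify_tarball("mesa-20.0.0.tar.xz")
```

| |
| Reading in chunks keeps memory use flat regardless of archive size. |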
| |
| New features |
| ------------ |
| |
| - OpenGL 4.6 on radeonsi. |
| - GL_ARB_gl_spirv on radeonsi. |
| - GL_ARB_spirv_extensions on radeonsi. |
| - GL_EXT_direct_state_access for compatibility profile. |
| - VK_AMD_device_coherent_memory on RADV. |
| - VK_AMD_mixed_attachment_samples on RADV. |
| - VK_AMD_shader_explicit_vertex_parameter on RADV. |
| - VK_AMD_shader_image_load_store_lod on RADV. |
| - VK_AMD_shader_fragment_mask on RADV. |
| - VK_EXT_subgroup_size_control on RADV/LLVM. |
| - VK_KHR_separate_depth_stencil_layouts on Intel, RADV. |
| - VK_KHR_shader_subgroup_extended_types on RADV. |
| - VK_KHR_swapchain_mutable_format on RADV. |
| - VK_KHR_shader_float_controls on RADV/ACO. |
| - GFX6 (Southern Islands) and GFX7 (Sea Islands) support on RADV/ACO. |
| - Wave32 support for GFX10 (Navi) on RADV/ACO. |
| - Compilation of Geometry Shaders on RADV/ACO. |
| - Vulkan 1.2 on Intel, RADV. |
| - GL_INTEL_shader_integer_functions2 and |
| VK_INTEL_shader_integer_functions2 on Intel. |
| |
| Bug fixes |
| --------- |
| |
| - drisw crashes on calling NULL putImage on EGL surfaceless platform |
| (pbuffer EGLSurface) |
| - [radeonsi][vaapi][bisected] invalid VASurfaceID when playing |
| interlaced DVB stream in Kodi |
| - [RADV] GPU hangs while the cutscene plays in the game Assassin's |
| Creed Origins |
| - ACO: The Elder Scrolls Online crashes on startup (Navi) |
| - Broken rendering of glxgears on S/390 architecture (64bit, BigEndian) |
| - aco: sun flickering with Assassin's Creed Origins |
| - !1896 broke ext_image_dma_buf_import piglit tests with radeonsi |
| - aco: wrong geometry with Assassin's Creed Origins on GFX6 |
| - valgrind errors since commit a8ec4082a41 |
| - OSMesa osmesa_choose_format returns a format not supported by |
| st_new_renderbuffer_fb |
| - Build error with VS on WIN |
| - Using EGL_KHR_surfaceless_context causes spurious "libEGL warning: |
| FIXME: egl/x11 doesn't support front buffer rendering." |
| - !3460 broke texsubimage test with piglit on zink+anv |
| - The screen is black when using ACO |
| - [Regression] JavaFX unbounded VRAM+RAM usage |
| - radv: implement VK_AMD_shader_explicit_vertex_parameter |
| - Civilization VI crashes when loading game (AMD Vega Mobile) |
| - [radeonsi] X-Server crashes when trying to start Guild Wars 2 with |
| the commits from !3421 |
| - aco: implement GFX6 support |
| - Add support for VK_KHR_swapchain_mutable_format |
| - radv: The Surge 2 crashes in ac_get_elem_bits() |
| - Use the OpenCL dispatch definitions from OpenCL_Headers |
| - [regression][ilk,g965,g45] various dEQP-GLES2.functional.shaders.\* |
| failures |
| - aco: Dead Rising 4 crashes in lower_to_hw_instr() on GFX6-GFX7 |
| - libvulkan_radeon.so crash with \`free(): double free detected in |
| tcache 2\` |
| - Commit be08e6a causes crash in com.android.launcher3 (Launcher) |
| - anv: Regression causing issues for radv when there are no Intel |
| devices |
| - Mesa no longer compiles with GCC 10 |
| - [Navi/aco] Guild Wars 2 - ring gfx timeout with commit 3bca0af2 |
| - [radv/aco] Regression is causing a soft crash in The Witcher 3 |
| - [bisected] [radeonsi] GPU hangs/resets while playing interlaced |
| content on Kodi with VAAPI |
| - [radeonsi] MSAA image not copied properly after image store through |
| texture view |
| - T-Rex and Manhattan onscreen performance issue on Android |
| - VkSamplerCreateInfo compareEnable not respected |
| - Freedreno drm softpin driver implementation leaks memory |
| - [POLARIS10] VRAM leak involving glTexImage2D with non-NULL data |
| argument |
| - [regression][bisected][ivb/byt] crucible test |
| func.push-constants.basic.q0 causes gpu hang |
| - MR 3096 broke lots of piglit ext_framebuffer_object tests on Raven |
| - Rise of the Tomb Raider benchmark crash on Dell XPS 7390 2-in-1 w/ |
| Iris Plus Graphics (Ice Lake 8x8 GT2) |
| - Raven Ridge (2400G): Resident Evil 2 crashes my machine |
| - Common practice of glGetActiveUniform leads to O(N²) behavior in Mesa |
| - Rocket League ingame artifacts |
| - [radv] SteamVR direct mode no longer works |
| - [ANV] unused create parameters not properly ignored |
| - [Bisected] Mesa fails to start alacritty with the wayland backend |
| (AMD Vega). |
| - [iris] piglit test clip-distance-vs-gs-out fails due to VUE map |
| mismatch between VS <-> GS stages |
| - Blocky corruption in The Surge 2 |
| - radeonsi: Floating point exception on R9 270 gpu for a set of traces |
| - [RADV] [Navi] LOD artifacting in Halo - The Master Chief Collection |
| (Halo Reach) |
| - [CTS] |
| dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r32g32b32\_\* |
| fail on GFX6-GFX8 |
| - Vulkan: Please consider adding another sample count to |
| sampledImageIntegerSampleCounts |
| - Navi10: Bitrate based encoding with VAAPI/RadeonSI unusable |
| - [RADV] create parameters not properly ignored |
| - [regression][bdw,gen9,hsw,icl][iris] gltcs failures on |
| mesa=8172b1fa03f |
| - Bugs in RadeonSI VAAPI implementation |
| - [GFX10] Glitch rendering Custom Avatars in Beat Saber |
| - intel/fs: Check for 16-bit immediates in |
| fs_visitor::lower_mul_dword_inst is too strict |
| - i965/iris: assert when destroy GL context with active query |
| - Visuals without alpha bits are not sRGB-capable |
| - swapchain throttling: wait for fence has 1ns timeout |
| - radeonsi: OpenGL app always produces page fault in gfxhub on Navi 10 |
| - [regression] |
| KHR-GLES31.core.geometry_shader.api.program_pipeline_vs_gs_capture |
| fails for various drivers |
| - [CTS] |
| dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point |
| hangs on GFX10 |
| - [RADV] SPIR-V warning when compiling shader using storage |
| multisampled image array |
| - [RADV] The Dead Rising 4 is causing a GPU hang with LLVM backend |
| - macOS u_thread.h:156:4: error: implicit declaration of function |
| 'pthread_getcpuclockid' |
| - [Wine / Vulkan] Doom 2016 Hangs on Main Menu |
| - NULL resource when playing VP9 video through VDPAU on RX 570 |
| - radeonsi: mpv --vo=vaapi incorrect rendering on gfx9+ |
| - [BSW/BDW] skia lcdblendmode & lcdoverlap test failure |
| - Create a way to prefer iris vs i965 via driconf |
| - [Bisected] i965: CS:GO crashes in emit_deref_copy_load_store with |
| debug Mesa |
| - radv/aco Jedi Fallen Order hair rendering buggy |
| - Inaccurate information on https://www.mesa3d.org/repository.html |
| about how to get git write access. |
| - [RADV] VK_KHR_timeline_semaphore balloons in runtime |
| - Shadow of Mordor has randomly dancing black shadows on Talion's face |
| - gen7 crucible failures func.push-constants.basic.q0 and |
| func.shader-subgroup-vote.basic.q0 |
| - GL_EXT_disjoint_timer_query failing with GL_INVALID_ENUM |
| - Unreal 4 Elemental and MatineeFightScene demos misrender |
| - gputest gimark has unwanted black liquorice flakes |
| - triangle strip clipping with GL_FIRST_VERTEX_CONVENTION causes wrong |
| vertex's attribute to be broadcasted for flat interpolation |
| - [bisected][regression][g45,g965,ilk] piglit arb_fragment_program kil |
| failures |
| - glcts crashes since the enablement of ARB_shading_language_include |
| - Android build broken |
| - ld.lld: error: duplicate symbol (mesa-19.3.0-rc1) |
| - Divinity: Original Sin Enhanced Edition(Native) crash on start |
| - HSW. Tropico 6 and SuperTuxKart have shadows flickering |
| - glxgears segfaults on POWER / Xvnc |
| - [regression][bdw,gen9,icl][iris] piglit failures on mesa |
| f9fd04aca15fd00889caa666ba38007268e67f5c |
| - Redundant builds of libmesa_classic and libmesa_gallium |
| - [IVB,BYT] [Regression] [Bisected] Core dump at launching |
| arb_compute_shader/linker/bug-93840.shader_test |
| - Vulkan drivers need access to format utils of gallium |
| - Disabling lower_fragdata_array causes shader-db to crash for some |
| drivers |
| - Android build broken by commit 9020f51 "util/u_endian: Add error |
| checks" |
| - radv secure compile feature breaks compilation of RADV on armhf EABI |
| (19.3-rc1) |
| - radv_debug.c warnings when compiling on 32 bits : cast to pointer |
| from integer of different size |
| - Meson: Mesa3D build failure with standalone Mingw-w64 multilib |
| - [regression][bisected] KHR46 VertexArrayAttribFormat has unexpectedly |
| generated GL_INVALID_OPERATION |
| - textureSize(samplerExternalOES, int) missing in desktop mesa 19.1.7 |
| implementation |
| - zink: implicitly casting integers to pointers, warnings on 32-bit |
| compile |
| - Objects leaving trails in Firefox with antialias and |
| preserveDrawingBuffer in three.js WebGLRenderer with mesa 19.2 |
| |
| Changes |
| ------- |
| |
| Aaron Watry (1): |
| |
| - clover/llvm: fix build after llvm 10 commit 1dfede3122ee |
| |
| Adam Jackson (1): |
| |
| - drisw: Cache the depth of the X drawable |
| |
| Afonso Bordado (4): |
| |
| - pan/midgard: Optimize comparisons with similar operations |
| - pan/midgard: Move midgard_is_branch_unit to helpers |
| - pan/midgard: Optimize branches with inverted arguments |
| - pan/midgard: Fix midgard_compile.h includes |
| |
| Alan Coopersmith (1): |
| |
| - intel/perf: adapt to platforms like Solaris without d_type in struct |
| dirent |
| |
| Alejandro Piñeiro (4): |
| |
| - v3d: adds an extra MOV for any sig.ld\* |
| - mesa/main/util: moving gallium u_mm to util, remove main/mm |
| - nir/opt_peephole_select: remove unused variables |
| - turnip: remove unused descriptor state dirty |
| |
| Alexander van der Grinten (1): |
| |
| - egl: Fix \_eglPointerIsDereferencable w/o mincore() |
| |
| Alexander von Gluck IV (1): |
| |
| - haiku/hgl: Fix build via header reordering |
| |
| Alyssa Rosenzweig (223): |
| |
| - pipe-loader: Build kmsro loader for with all kmsro targets |
| - pan/midgard: Remove OP_IS_STORE_VARY |
| - pan/midgard: Add a dummy source for loads |
| - pan/midgard: Refactor swizzles |
| - pan/midgard: Eliminate blank_alu_src |
| - pan/midgard: Use fp32 blend shaders |
| - pan/midgard: Validate tags when branching |
| - pan/midgard: Fix quadword_count handling |
| - pan/midgard: Compute bundle interference |
| - pan/midgard: Add bizarre corner case |
| - pan/midgard: offset_swizzle doesn't need dstsize |
| - pan/midgard: Extend offset_swizzle to non-32-bit |
| - pan/midgard: Extend swizzle packing for vec4/16-bit |
| - pan/midgard: Extend default_phys_reg to !32-bit |
| - panfrost/ci: Update T760 expectations |
| - pan/midgard: Fix printing of half-registers in texture ops |
| - pan/midgard: Disassemble half-steps correctly |
| - pan/midgard: Pass shader stage to disassembler |
| - pan/midgard: Switch base for vertex texturing on T720 |
| - nir: Add load_output_u8_as_fp16_pan intrinsic |
| - pan/midgard: Identify ld_color_buffer_u8_as_fp16\* |
| - pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan |
| - pan/midgard: Pack load/store masks |
| - panfrost: Select format-specific blending intrinsics |
| - pan/midgard: Add blend shader selection bits for MRT |
| - pan/midgard: Implement linearly-constrained register allocation |
| - pan/midgard: Integrate LCRA |
| - pan/midgard: Remove util/ra support |
| - pan/midgard: Compute spill costs |
| - pan/lcra: Use Chaitin's spilling heuristic |
| - pan/midgard: Copypropagate vector creation |
| - pan/midgard: Fix copypropagation for textures |
| - pan/midgard: Generalize texture registers across GPUs |
| - pan/midgard: Fix vertex texturing on early Midgard |
| - pan/midgard: Use texture, not textureLod, on early Midgard |
| - pan/midgard: Disassemble with old pipeline always on T720 |
| - pan/midgard: Prioritize texture registers |
| - pan/midgard: Expand 64-bit writemasks |
| - pan/midgard: Implement i2i64 and u2u64 |
| - pan/midgard: Fix mir_round_bytemask_down for !32b |
| - pan/midgard: Pack 64-bit swizzles |
| - pan/midgard: Use generic constant packing for 8/64-bit |
| - pan/midgard: Implement non-aligned UBOs |
| - pan/midgard: Expose more typesize helpers |
| - pan/midgard: Fix masks/alignment for 64-bit loads |
| - pan/midgard: Represent ld/st offset unpacked |
| - pan/midgard: Use shader stage in mir_op_computes_derivative |
| - panfrost: Stub out clover callbacks |
| - panfrost: Pass kernel inputs as uniforms |
| - panfrost: Disable tiling for GLOBAL resources |
| - panfrost: Set PIPE_COMPUTE_CAP_ADDRESS_BITS to 64 |
| - pan/midgard: Introduce quirks checks |
| - panfrost: Add the lod_bias field |
| - nir: Add load_sampler_lod_paramaters_pan intrinsic |
| - pan/midgard: Implement load_sampler_lod_paramaters_pan |
| - pan/midgard: Add LOD bias/clamp lowering |
| - pan/midgard: Describe quirk MIDGARD_BROKEN_LOD |
| - pan/midgard: Enable LOD lowering only on buggy chips |
| - panfrost: Add lcra.c to Android.mk |
| - pan/midgard: Use lower_tex_without_implicit_lod |
| - panfrost: Add information about T720 tiling |
| - panfrost: Implement pan_tiler for non-hierarchy GPUs |
| - panfrost: Simplify draw_flags |
| - pan/midgard: Splatter on fragment out |
| - gitlab-ci: Remove non-default skips from Panfrost |
| - panfrost: Remove blend shader hack |
| - panfrost: Update SET_VALUE with information from igt |
| - panfrost: Rename SET_VALUE to WRITE_VALUE |
| - gallium/util: Support POLYGON in u_stream_outputs_for_vertices |
| - pan/midgard: Move spilling code out of scheduler |
| - pan/midgard: Split spill node selection/spilling |
| - pan/midgard: Simplify spillability test |
| - pan/midgard: Remove spill cost heuristic |
| - pan/midgard: Move bounds checking into LCRA |
| - pan/midgard: Remove consecutive_skip code |
| - pan/midgard: Remove code marked "TODO: remove me" |
| - pan/midgard: Dynamically allocate r26/27 for spills |
| - pan/midgard: Use no_spill bitmask |
| - pan/midgard: Don't use no_spill for memory spill src |
| - pan/midgard: Force alignment for csel_v |
| - pan/midgard: Don't try to free NULL in LCRA |
| - pan/midgard: Simplify and fix vector copyprop |
| - pan/midgard: Fix shift for TLS access |
| - panfrost: Describe thread local storage sizing rules |
| - panfrost: Rename unknown_address_0 -> scratchpad |
| - panfrost: Split stack_shift nibble from unk0 |
| - panfrost: Add routines to calculate stack size/shift |
| - panfrost: Factor out panfrost_query_raw |
| - panfrost: Query core count and thread tls alloc |
| - panfrost: Route stack_size from compiler |
| - panfrost: Emit SFBD/MFBD after a batch, instead of before |
| - panfrost: Handle minor cppcheck issues |
| - pan/midgard: Remove unused ld/st packing helpers |
| - pan/midgard: Handle misc. cppcheck warnings |
| - panfrost: Calculate maximum stack_size per batch |
| - panfrost: Pass size to panfrost_batch_get_scratchpad |
| - pandecode: Add cast |
| - panfrost: Move nir_undef_to_zero to Midgard compiler |
| - panfrost: Move property queries to \_encoder |
| - panfrost: Add panfrost_model_name helper |
| - panfrost: Report GPU name in es2_info |
| - ci: Remove T760/T860 from CI temporarily |
| - panfrost: Pass blend RT number through |
| - pan/midgard: Add schedule barrier after fragment writeout |
| - pan/midgard: Writeout per render target |
| - pan/midgard: Fix liveness analysis with multiple epilogues |
| - pan/midgard: Set r1.w magic |
| - panfrost: Fix FBD issue |
| - ci: Reinstate Panfrost CI |
| - panfrost: Remove fbd_type enum |
| - panfrost: Pack invocation_shifts manually instead of a bit field |
| - panfrost: Remove asserts in panfrost_pack_work_groups_compute |
| - panfrost: Simplify sampler upload condition |
| - panfrost: Don't double-create scratchpad |
| - panfrost: Add PAN_MESA_DEBUG=precompile for shader-db |
| - panfrost: Let precompile imply shaderdb |
| - panfrost: Handle empty shaders |
| - pan/midgard: Use a reg temporary for multiple writes |
| - pan/midgard: Hoist temporary coordinate for cubemaps |
| - pan/midgard: Set .shadow for shadow samplers |
| - pan/midgard: Set Z to shadow comparator for 2D |
| - pan/midgard: Add uniform/work heuristic |
| - pan/midgard: Implement textureOffset for 2D textures |
| - pan/midgard: Fix crash with txs |
| - pan/midgard: Lower txd with lower_tex |
| - panfrost: Decode shader types in pantrace shader-db |
| - pan/decode: Skip COMPUTE in blobber-db |
| - pan/decode: Prefix blobberdb with MESA_SHADER\_\* |
| - pan/decode: Append 0:0 spills:fills to blobber-db |
| - pan/midgard: Fix disassembler cycle/quadword counting |
| - pan/midgard: Bounds check lcra_restrict_range |
| - pan/midgard: Extend IS_VEC4_ONLY to arguments |
| - pan/midgard: Clamp LOD register swizzle |
| - pan/midgard: Expand swizzle for texelFetch |
| - pan/midgard: Fix fallthrough from offset to comparator |
| - pan/midgard: Do witchcraft on texture offsets |
| - pan/midgard: Generalize temp coordinate to non-2D |
| - pan/midgard: Implement shadow cubemaps |
| - pan/midgard: Enable lower_(un)pack\_\* lowering |
| - pan/midgard: Support loads from R11G11B10 in a blend shader |
| - pan/midgard: Add mir_upper_override helper |
| - pan/midgard: Compute destination override |
| - panfrost: Rename pan_instancing.c -> pan_attributes.c |
| - panfrost: Factor batch/resource out of instancing routines |
| - panfrost: Move instancing routines to encoder/ |
| - panfrost: Factor out panfrost_compute_magic_divisor |
| - panfrost: Fix off-by-one in pan_invocation.c |
| - pan/decode: Fix reference computation for invocations |
| - panfrost: Slight cleanup of Gallium's pan_attribute.c |
| - panfrost: Remove pan_shift_odd |
| - pan/decode: Handle gl_VertexID/gl_InstanceID |
| - panfrost: Unset vertex_id_zero_based |
| - pan/midgard: Factor out emit_attr_read |
| - pan/midgard: Lower gl_VertexID/gl_InstanceID to attributes |
| - panfrost: Extend attribute_count for vertex builtins |
| - panfrost: Route gl_VertexID through cmdstream |
| - pan/midgard: Fix minor typo |
| - panfrost: Remove MALI_SPECIAL_ATTRIBUTE_BASE defines |
| - panfrost: Update information on fixed attributes/varyings |
| - panfrost: Remove MALI_ATTR_INTERNAL |
| - panfrost: Inline away MALI_NEGATIVE |
| - panfrost: Implement remaining texture wrap modes |
| - panfrost: Add pan_attributes.c to Android.mk |
| - panfrost: Add missing #include in common header |
| - panfrost: Remove mali_alt_func |
| - panfrost: Update comment about work/uniform_count |
| - panfrost: Remove 32-bit next_job path |
| - glsl: Set .flat for gl_FrontFacing |
| - pan/midgard: Promote tilebuffer reads to 32-bit |
| - pan/midgard: Use type-appropriate st_vary |
| - pan/midgard: Implement flat shading |
| - panfrost: Identify glProvokingVertex flag |
| - panfrost: Disable some CAPs we want lowered |
| - panfrost: Implement integer varyings |
| - panfrost: Remove MRT indirection in blend shaders |
| - panfrost: Respect glPointSize() |
| - pan/midgard: Convert fragment writeout to proper branches |
| - pan/midgard: Remove prepacked_branch |
| - panfrost: Handle RGB16F colour clear |
| - panfrost: Pack MRT blend shaders into a single BO |
| - pan/midgard: Fix memory corruption in constant combining |
| - pan/midgard: Use better heuristic for shader termination |
| - pan/midgard: Generalize IS_ALU and quadword_size |
| - pan/midgard: Generate MRT writeout loops |
| - pan/midgard: Remove old comment |
| - pan/midgard: Identity ld_color_buffer as 32-bit |
| - pan/midgard: Use upper ALU tags for MFBD writeout |
| - panfrost: Texture from Z32F_S8 as R32F |
| - panfrost: Support rendering to non-zero Z/S layers |
| - panfrost: Implement sRGB blend shaders |
| - panfrost: Cleanup tiling selection logic |
| - panfrost: Report MSAA 4x supported for dEQP |
| - panfrost: Handle PIPE_FORMAT_R10G10B10A2_USCALED |
| - panfrost: Respect constant buffer_offset |
| - panfrost: Adjust for mismatch between hardware/Gallium in arrays/cube |
| - pan/midgard: Account for z/w flip in texelFetch |
| - panfrost: Don't double-flip Z/W for 2D arrays |
| - pan/midgard: Support indirect UBO offsets |
| - panfrost: Fix linear depth textures |
| - pan/midgard: Bytemasks should round up, not round down |
| - panfrost: Identify un/pack colour opcodes |
| - pan/midgard: Fix recursive csel scheduling |
| - panfrost: Expose some functionality with dEQP flag |
| - panfrost: Compile tiling routines with -O3 |
| - panfrost,lima: De-Galliumize tiling routines |
| - panfrost: Rework linear<--->tiled conversions |
| - panfrost: Add pandecode entries for ASTC/ETC formats |
| - panfrost: Fix crash in compute variant allocation |
| - panfrost: Drop mysterious zero=0xFFFF field |
| - panfrost: Don't use implicit mali_exception_status enum |
| - pan/decode: Remove last_size |
| - pan/midgard: Remove pack_color define |
| - pan/decode: Remove SHORT_SLIDE indirection |
| - panfrost: Fix 32-bit warning for \`indices\` |
| - pan/decode: Drop MFBD compute shader stuff |
| - pan/midgard: Record TEXTURE_OP_BARRIER |
| - pan/midgard: Disassemble barrier instructions |
| - pan/midgard: Validate barriers use a barrier tag |
| - pan/midgard: Handle tag 0x4 as texture |
| - pan/midgard: Remove float_bitcast |
| - pan/midgard: Fix missing prefixes |
| - pan/midgard: Don't crash with constants on unknown ops |
| - pan/midgard: Use fprintf instead of printf for constants |
| |
| Andreas Baierl (14): |
| |
| - lima: Beautify stream dumps |
| - lima: Parse VS and PLBU command stream while making a dump |
| - lima/streamparser: Fix typo in vs semaphore parser |
| - lima/streamparser: Add findings introduced with gl_PointSize |
| - lima/parser: Some fixes and cleanups |
| - lima/parser: Add RSW parsing |
| - lima/parser: Add texture descriptor parser |
| - lima: Rotate dump files after each finished pp frame |
| - lima: Fix dump file creation |
| - lima/parser: Fix rsw parser |
| - lima/parser: Fix VS cmd stream parser |
| - lima/parser: Make rsw alpha blend parsing more readable |
| - lima: Add stencil support |
| - lima: Fix alpha blending |
| |
| Andres Rodriguez (1): |
| |
| - vulkan/wsi: disable the hardware cursor |
| |
| Andrii Simiklit (5): |
| |
| - main: fix several 'may be used uninitialized' warnings |
| - glsl: fix an incorrect max_array_access after optimization of |
| ssbo/ubo |
| - glsl: fix a binding points assignment for ssbo/ubo arrays |
| - glsl/nir: do not change an element index to have correct block name |
| - mesa/st: fix a memory leak in get_version |
| |
| Anthony Pesch (5): |
| |
| - util: import xxhash |
| - util: move fnv1a hash implementation into its own header |
| - util/hash_table: replace \_mesa_hash_data's fnv1a hash function with |
| xxhash |
| - util/hash_table: added hash functions for integer types |
| - util/hash_table: update users to use new optimal integer hash |
| functions |
| |
| Anuj Phogat (2): |
| |
| - intel: Add device info for 1x4x6 Jasper Lake |
| - intel: Add pci-ids for Jasper Lake |
| |
| Arno Messiaen (5): |
| |
| - lima: fix stride in texture descriptor |
| - lima: add layer_stride field to lima_resource struct |
| - lima: introduce ppir_op_load_coords_reg to differentiate between |
| loading texture coordinates straight from a varying vs loading them |
| from a register |
| - lima: add cubemap support |
| - lima/ppir: add lod-bias support |
| |
| Bas Nieuwenhuizen (33): |
| |
| - radv: Fix timeout handling in syncobj wait. |
| - radv: Remove \_mesa_locale_init/fini calls. |
| - turnip: Remove \_mesa_locale_init/fini calls. |
| - anv: Remove \_mesa_locale_init/fini calls. |
| - radv: Fix disk_cache_get size argument. |
| - radv: Close all unnecessary fds in secure compile. |
| - radv: Do not change scratch settings while shaders are active. |
| - radv: Allocate cmdbuffer space for buffer marker write. |
| - radv: Enable VK_KHR_buffer_device_address. |
| - amd/llvm: Refactor ac_build_scan. |
| - radv: Unify max_descriptor_set_size. |
| - radv: Fix timeline semaphore refcounting. |
| - radv: Fix RGBX Android<->Vulkan format correspondence. |
| - amd/common: Fix tcCompatible degradation on Stoney. |
| - amd/common: Always use addrlib for HTILE tc-compat. |
| - radv: Limit workgroup size to 1024. |
| - radv: Expose all sample counts for integer formats as well. |
| - amd/common: Handle alignment of 96-bit formats. |
| - nir: Add clone/hash/serialize support for non-uniform tex |
| instructions. |
| - nir: print non-uniform tex fields. |
| - amd/common: Always initialize gfx9 mipmap offset/pitch. |
| - turnip: Use VK_NULL_HANDLE instead of NULL. |
| - meson: Enable -Werror=int-conversion. |
| - Revert "amd/common: Always initialize gfx9 mipmap offset/pitch." |
| - radv: Only use the gfx mipmap level offset/pitch for linear textures. |
| - spirv: Fix glsl type assert in spir2nir. |
| - radv: Emit a BATCH_BREAK when changing pixel shaders or |
| CB_TARGET_MASK. |
| - radv: Use new scanout gfx9 metadata flag. |
| - radv: Disable VK_EXT_sample_locations on GFX10. |
| - radv: Remove syncobj_handle variable in header. |
| - radv: Expose VK_KHR_swapchain_mutable_format. |
| - radv: Allow DCC & TC-compat HTILE with |
| VK_IMAGE_CREATE_EXTENDED_USAGE_BIT. |
| - radv: Do not set SX DISABLE bits for RB+ with unused surfaces. |
| |
| Ben Crocker (1): |
| |
| - llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders |
| |
| Bernd Kuhls (1): |
| |
| - util/os_socket: Include unistd.h to fix build error |
| |
| Boris Brezillon (21): |
| |
| - panfrost: MALI_DEPTH_TEST is actually MALI_DEPTH_WRITEMASK |
| - panfrost: Destroy the upload manager allocated in |
| panfrost_create_context() |
| - panfrost: Release the ctx->pipe_framebuffer ref |
| - panfrost: Move BO cache related fields to a sub-struct |
| - panfrost: Try to evict unused BOs from the cache |
| - gallium: Fix the ->set_damage_region() implementation |
| - panfrost: Make sure we reset the damage region of RTs at flush time |
| - panfrost: Remove unneeded phi nodes |
| - panfrost/midgard: Fix swizzle for store instructions |
| - panfrost/midgard: Print the actual source register for store |
| operations |
| - panfrost/midgard: Use a union to manipulate embedded constants |
| - panfrost/midgard: Rework mir_adjust_constants() to make it type/size |
| agnostic |
| - panfrost/midgard: Make sure promote_fmov() only promotes 32-bit imovs |
| - panfrost/midgard: Factorize f2f and u2u handling |
| - panfrost/midgard: Add f2f64 support |
| - panfrost/midgard: Fix mir_print_instruction() for branch instructions |
| - panfrost/midgard: Add 64 bits float <-> int converters |
| - panfrost/midgard: Add missing lowering passes for type/size |
| conversion ops |
| - panfrost/midgard: Add a condense_writemask() helper |
| - panfrost/midgard: Prettify embedded constant prints |
| - panfrost: Fix the damage box clamping logic |
| |
| Brian Ho (14): |
| |
| - turnip: Update tu_query_pool with turnip-specific fields |
| - turnip: Implement vkCreateQueryPool for occlusion queries |
| - turnip: Implement vkCmdBeginQuery for occlusion queries |
| - turnip: Implement vkCmdEndQuery for occlusion queries |
| - turnip: Update query availability on render pass end |
| - turnip: Implement vkGetQueryPoolResults for occlusion queries |
| - turnip: Implement vkCmdResetQueryPool |
| - turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries |
| - anv: Properly fetch partial results in vkGetQueryPoolResults |
| - anv: Handle unavailable queries in vkCmdCopyQueryPoolResults |
| - turnip: Enable occlusionQueryPrecise |
| - turnip: Free event->bo on vkDestroyEvent |
| - turnip: Fix vkGetQueryPoolResults with available flag |
| - turnip: Fix vkCmdCopyQueryPoolResults with available flag |
| |
| Brian Paul (4): |
| |
| - s/APIENTRY/GLAPIENTRY/ in teximage.c |
| - nir: fix a couple signed/unsigned comparison warnings in |
| nir_builder.h |
| - Call shmget() with permission 0600 instead of 0777 |
| - nir: no-op C99 \_Pragma() with MSVC |
| |
| C Stout (1): |
| |
| - util/vector: Fix u_vector_foreach when head rolls over |
| |
| Caio Marcelo de Oliveira Filho (24): |
| |
| - spirv: Don't leak GS initialization to other stages |
| - glsl: Check earlier for MaxShaderStorageBlocks and MaxUniformBlocks |
| - glsl: Check earlier for MaxTextureImageUnits and MaxImageUniforms |
| - anv: Initialize depth_bounds_test_enable when not explicitly set |
| - spirv: Consider the sampled_image case in wa_glslang_179 workaround |
| - intel/fs: Lower 64-bit MOVs after lower_load_payload() |
| - intel/fs: Fix lowering of dword multiplication by 16-bit constant |
| - intel/vec4: Fix lowering of multiplication by 16-bit constant |
| - anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT) |
| - spirv: Implement SPV_KHR_non_semantic_info |
| - panfrost: Fix Makefile.sources |
| - anv: Drop unused function parameter |
| - anv: Ignore some CreateInfo structs when rasterization is disabled |
| - intel/fs: Only use SLM fence in compute shaders |
| - spirv: Drop EXT for PhysicalStorageBuffer symbols |
| - spirv: Handle PhysicalStorageBuffer in memory barriers |
| - nir: Add missing nir_var_mem_global to various passes |
| - intel/fs: Add FS_OPCODE_SCHEDULING_FENCE |
| - intel/fs: Add workgroup_size() helper |
| - intel/fs: Don't emit fence for shared memory if only one thread is |
| used |
| - intel/fs: Don't emit control barrier if only one thread is used |
| - anv: Always initialize target_stencil_layout |
| - intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT |
| - nir: Make nir_deref_path_init skip trivial casts |
| |
| Chris Wilson (1): |
| |
| - egl: Mention if swrast is being forced |
| |
| Christian Gmeiner (24): |
| |
| - drm-shim: fix EOF case |
| - etnaviv: rs: upsampling is not supported |
| - etnaviv: add drm-shim |
| - etnaviv: drop not used config_out function param |
| - etnaviv: use a more self-explanatory param name |
| - etnaviv: handle 8 byte block in tiling |
| - etnaviv: add support for extended pe formats |
| - etnaviv: fix integer vertex formats |
| - etnaviv: use NORMALIZE_SIGN_EXTEND |
| - etnaviv: fix R10G10B10A2 vertex format entries |
| - etnaviv: handle integer case for GENERIC_ATTRIB_SCALE |
| - etnaviv: remove dead code |
| - etnaviv: remove not used etna_bits_ones(..) |
| - etnaviv: drop compiled_rs_state forward declaration |
| - etnaviv: update resource status after flushing |
| - gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS |
| - etnaviv: check if MSAA is supported |
| - etnaviv: gc400 does not support any vertex sampler |
| - etnaviv: use a better name for FE_VERTEX_STREAM_UNK14680 |
| - etnaviv: move state based texture structs |
| - etnaviv: move descriptor based texture structs |
| - etnaviv: add deqp debug option |
| - etnaviv: drop default state for PE_STENCIL_CONFIG_EXT2 |
| - etnaviv: drm-shim: add GC400 |
| |
| Connor Abbott (19): |
| |
| - nir: Fix non-determinism in lower_global_vars_to_local |
| - radv: Rename ac_arg_regfile |
| - ac: Add a shared interface between radv, radeonsi, LLVM and ACO |
| - ac/nir, radv, radeonsi: Switch to using ac_shader_args |
| - radv: Move argument declaration out of nir_to_llvm |
| - aco: Constify radv_nir_compiler_options in isel |
| - aco: Use radv_shader_args in aco_compile_shader() |
| - aco: Split vector arguments at the beginning |
| - aco: Make num_workgroups and local_invocation_ids one argument each |
| - radv: Replace supports_spill with explict_scratch_args |
| - aco: Use common argument handling |
| - aco: Make unused workgroup id's 0 |
| - nir: Maintain the algebraic automaton's state as we work. |
| - a6xx: Add more CP packets |
| - freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE |
| - freedreno: Fix CP_MEM_TO_REG flag definitions |
| - freedreno: Document CP_COND_REG_EXEC more |
| - freedreno: Document CP_UNK_A6XX_55 |
| - freedreno: Document CP_INDIRECT_BUFFER_CHAIN |
| |
| Daniel Ogorchock (2): |
| |
| - panfrost: Fix panfrost_bo_access memory leak |
| - panfrost: Fix headers and gpu_headers memory leak |
| |
| Daniel Schürmann (58): |
| |
| - aco: fix immediate offset for spills if scratch is used |
| - aco: only use single-dword loads/stores for spilling |
| - aco: fix accidental reordering of instructions when scheduling |
| - aco: workaround Tonga/Iceland hardware bug |
| - aco: fix invalid access on Pseudo_instructions |
| - aco: preserve kill flag on moved operands during RA |
| - aco: rematerialize s_movk instructions |
| - aco: check if SALU instructions are preceded by exec when |
| calculating WQM needs |
| - aco: value number instructions using the execution mask |
| - aco: use s_and_b64 exec to reduce uniform booleans to one bit |
| - amd/llvm: Add Subgroup Scan functions for SI |
| - radv: Enable Subgroup Arithmetic and Clustered for SI |
| - aco: don't value-number instructions from within a loop with ones |
| after the loop. |
| - aco: don't split live-ranges of linear VGPRs |
| - aco: fix a couple of value numbering issues |
| - aco: refactor visit_store_fs_output() to use the Builder |
| - aco: Initial GFX7 Support |
| - aco: SI/CI - fix sampler aniso |
| - aco: fix SMEM offsets for SI/CI |
| - aco: implement nir_op_fquantize2f16 for SI/CI |
| - aco: only use scalar loads for readonly buffers on SI/CI |
| - aco: implement nir_op_isign on SI/CI |
| - aco: move buffer_store data to VGPR if needed |
| - aco: implement quad swizzles for SI/CI |
| - aco: recognize SI/CI SMRD hazards |
| - aco: fix disassembly of writelane instructions. |
| - aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI |
| - aco: implement 64bit VGPR shifts for SI/CI |
| - aco: make 1/2*PI a literal constant on SI/CI |
| - aco: implement 64bit i2b for SI/CI |
| - aco: implement 64bit ine/ieq for SI/CI |
| - aco: disable disassembly for SI/CI due to lack of support by LLVM |
| - radv: only flush scalar cache for SSBO writes with ACO on GFX8+ |
| - aco: flush denorms after fmin/fmax on pre-GFX9 |
| - aco: don't use a scalar temporary for reductions on GFX10 |
| - aco: implement (clustered) reductions for SI/CI |
| - aco: implement inclusive_scan for SI/CI |
| - aco: implement exclusive scan for SI/CI |
| - radv: disable Youngblood app profile if ACO is used |
| - aco: return to loop_active mask at continue_or_break blocks |
| - radv: Enable ACO on GFX7 (Sea Islands) |
| - aco: use soffset for MUBUF instructions on SI/CI |
| - aco: improve readfirstlane after uniform ssbo loads on GFX7 |
| - aco: propagate temporaries into expanded vectors |
| - nir: fix printing of var_decl with more than 4 components. |
| - aco: compact various Instruction classes |
| - aco: compact aco::span<T> to use uint16_t offset and size instead of |
| pointer and size_t. |
| - aco: fix unconditional demote_to_helper |
| - aco: rework lower_to_cssa() |
| - aco: handle phi affinities transitively through parallelcopies |
| - aco: ignore parallelcopies to the same register on jump threading |
| - aco: fix combine_salu_not_bitwise() when SCC is used |
| - aco: reorder VMEM operands in ACO IR |
| - aco: fix register allocation with multiple live-range splits |
| - aco: simplify adjust_sample_index_using_fmask() & get_image_coords() |
| - aco: simplify gathering of MIMG address components |
| - docs: add new features for RADV/ACO. |
| - aco: fix image_atomic_cmp_swap |
| |
| Daniel Stone (2): |
| |
| - Revert "st/dri: do FLUSH_VERTICES before calling flush_resource" |
| - Revert "gallium: add st_context_iface::flush_resource to call |
| FLUSH_VERTICES" |
| |
| Danylo Piliaiev (12): |
| |
| - intel/blorp: Fix usage of uninitialized memory in key hashing |
| - i965/program_cache: Lift restriction on shader key size |
| - intel/blorp: Fix usage of uninitialized memory in key hashing |
| - intel/fs: Do not lower large local arrays to scratch on gen7 |
| - i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround |
| - glsl: Add varyings to "zero-init of uninitialized vars" workaround |
| - drirc: Add glsl_zero_init workaround for GpuTest |
| - iris/query: Implement PIPE_QUERY_GPU_FINISHED |
| - iris: Fix value of out-of-bounds accesses for vertex attributes |
| - i965: Do not set front_buffer_dirty if there is no front buffer |
| - st/mesa: Handle the rest renderbuffer formats from OSMesa |
| - st/nir: Unify inputs_read/outputs_written before serializing NIR |
| |
| Dave Airlie (74): |
| |
| - nir/serialize: pack function has name and entry point into flags. |
| - nir/serialize: fix serializing functions with no implementations. |
| - spirv: don't store 0 to cs.ptr_size for non kernel stages. |
| - spirv: get the correct type for function returns. |
| - spirv/nir/opencl: handle some multiply instructions. |
| - nir: add 64-bit ufind_msb lowering support. (v2) |
| - nouveau: request ufind_msb64 lowering in the frontend. |
| - vtn/opencl: add clz support |
| - nir: fix deref offset builder |
| - llvmpipe: initial query buffer object support. (v2) |
| - docs: add llvmpipe to ARB_query_buffer_object. |
| - gallivm: split out the flow control ir to a common file. |
| - gallivm: nir->tgsi info convertor (v2) |
| - gallivm: add popcount intrinsic wrapper |
| - gallivm: add cttz wrapper |
| - gallivm: add selection for non-32 bit types |
| - gallivm: add nir->llvm translation (v2) |
| - draw: add nir info gathering and building support |
| - gallium: add nir lowering passes for the draw pipe stages. (v2) |
| - gallivm: add swizzle support where one channel isn't defined. |
| - llvmpipe: add initial nir support |
| - nir/samplers: don't zero samplers_used/txf. |
| - llvmpipe/images: handle undefined atomic without crashing |
| - gallivm/llvmpipe: add support for front facing in sysval. |
| - llvmpipe: enable texcoord semantics |
| - gallium/scons: fix graw-xlib build on OSX. |
| - llvmpipe: add queries disabled flag |
| - llvmpipe: disable occlusion queries when requested by state tracker |
| - draw: add support for collecting primitives generated outside |
| streamout |
| - llvmpipe: enable support for primitives generated outside streamout |
| - aco: handle gfx7 int8/10 clamping on exports |
| - gallivm: add bitfield reverse and ufind_msb |
| - llvmpipe/nir: handle texcoord requirements |
| - gallivm: fix transpose for when first channel isn't created |
| - gallivm: fix perspective enable if usage_mask doesn't have 0 bit set |
| - gallivm/nir: cleanup code and call cmp wrapper |
| - gallivm/nir: copy compare ordering code from tgsi |
| - gallivm: add base instance sysval support |
| - gallivm/draw: add support for draw_id system value. |
| - gallivm: fixup base_vertex support |
| - llvmpipe: enable ARB_shader_draw_parameters. |
| - vtn: convert vload/store to single value loops |
| - vtn/opencl: add shuffle/shuffle support |
| - gallivm/nir: wrap idiv to avoid divide by 0 (v2) |
| - llvmpipe: switch to NIR by default |
| - nir: sanitize work group intrinsics to always be 32-bit. |
| - gallivm: add 64-bit const int creator. |
| - llvmpipe/gallivm: add kernel inputs |
| - gallivm: add support for 8-bit/16-bit integer builders |
| - gallivm: pick integer builders for alu instructions. |
| - gallivm/nir: allow 8/16-bit conversion and comparison. |
| - tgsi/mesa: handle KERNEL case |
| - gallivm/llvmpipe: add support for work dimension intrinsic. |
| - gallivm/llvmpipe: add support for block size intrinsic |
| - gallivm/llvmpipe: add support for global operations. |
| - llvmpipe: handle serialized nir as a shader type. |
| - llvmpipe: add support for compute shader params |
| - llvmpipe/nir: use nir_max_vec_components in more places |
| - gallivm: handle non-32 bit undefined |
| - llvmpipe: lower hadd/add_sat |
| - gallivm/nir: lower packing |
| - gallivm/nir: add vec8/16 support |
| - llvmpipe: add debug option to enable OpenCL support. |
| - gallivm: fixup const int64 builder. |
| - llvmpipe: enable ARB_shader_group_vote. |
| - gallium/util: add multi_draw_indirect to util_draw_indirect. |
| - llvmpipe: enable driver side multi draw indirect |
| - llvmpipe: add support for ARB_indirect_parameters. |
| - llvmpipe: add ARB_derivative_control support |
| - gallivm: fix gather component handling. |
| - llvmpipe: fix some integer instruction lowering. |
| - gallivm: fix gather offset casting |
| - gallivm: fix find lsb |
| - gallivm/nir: add missing break for isub. |
| |
| David Heidelberg (1): |
| |
| - .mailmap: use correct email address |
| |
| David Stevens (1): |
| |
| - virgl: support emulating planar image sampling |
| |
| Denis Pauk (2): |
| |
| - gallium/swr: Enable support bptc format. |
| - docs/features: mark GL_ARB_texture_compression_bptc as done for |
| llvmpipe, softpipe, swr |
| |
| Dongwon Kim (3): |
| |
| - gallium: enable INTEL_PERFORMANCE_QUERY |
| - iris: INTEL performance query implementation |
| - gallium: check all planes' pipe formats in case of multi-samplers |
| |
| Drew Davenport (1): |
| |
| - radeonsi: Clear uninitialized variable |
| |
| Drew DeVault (1): |
| |
| - st_get_external_sampler_key: improve error message |
| |
| Duncan Hopkins (1): |
| |
| - zink: make sure src image is transfer-src-optimal |
| |
| Dylan Baker (69): |
| |
| - Bump VERSION to 20.0.0-devel |
| - docs/new_features: Empty the feature list for the 20.0 cycle |
| - nir: correct use of identity check in python |
| - r200: use preprocessor for big vs little endian checks |
| - r100: Use preprocessor to select big vs little endian paths |
| - dri/osmesa: use preprocessor for selecting endian code paths |
| - util/u_endian: Use \_WIN32 instead of \_MSC_VER |
| - util/u_endian: set PIPE_ARCH_*_ENDIAN to 1 |
| - mesa/main: replace uses of \_mesa_little_endian with preprocessor |
| - mesa/swrast: replace instances of \_mesa_little_endian with |
| preprocessor |
| - mesa/main: delete now unused \_mesa_little_endian |
| - gallium/osmesa: Use PIPE_ARCH_*_ENDIAN instead of little_endian |
| function |
| - util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIAN |
| - util/u_endian: Add error checks |
| - meson: Add dep_glvnd to egl deps when building with glvnd |
| - docs: add release notes for 19.2.3 |
| - docs: add sha256 sum to 19.2.3 release notes |
| - docs: update calendar, add news item and link release notes for |
| 19.2.2 |
| - meson: gtest needs pthreads |
| - gallium/osmesa: Convert osmesa test to gtest |
| - osmesa/tests: Extend render test to cover other working cases |
| - util: Use ZSTD for shader cache if possible |
| - docs: Add release notes for 19.2.4 |
| - docs: Add SHA256 sum for 19.2.4 |
| - docs: update calendar, add news item and link release notes for |
| 19.2.4 |
| - docs: Add relnotes for 19.2.5 |
| - docs/relnotes/19.2.5: Add SHA256 sum |
| - docs: update calendar, add news item and link release notes for |
| 19.2.5 |
| - docs/release-calendar: Update for extended 19.3 rc period |
| - docs: Add release notes for 19.2.6 |
| - docs: Add SHA256 sum for 19.2.6 |
| - docs: update calendar, add news item and link release notes for |
| 19.2.6 |
| - gallium/auxiliary: Fix uses of gnu struct = {} extension |
| - meson: Add -Werror=gnu-empty-initializer to MSVC compat args |
| - docs: Add release notes for 19.2.7 |
| - docs: Add SHA256 sums for 19.2.7 |
| - docs: update calendar, add news item and link release notes for |
| 19.2.7 |
| - docs: Update mesa 19.3 release calendar |
| - meson/broadcom: libbroadcom_cle needs expat headers |
| - meson/broadcom: libbroadcom_cle also needs zlib |
| - docs: add release notes for 19.3.0 |
| - docs/19.3.0: Add SHA256 sums |
| - docs: Update release notes, index, and calendar for 19.3.0 |
| - docs: add release notes for 19.3.1 |
| - docs: Add release notes, update calendar, and add news for 19.3.1 |
| - docs: add relnotes for 19.2.8 |
| - docs/relnotes/19.2.8: Add SHA256 sum |
| - docs: Add release notes, news, and update calendar for 19.2.8 |
| - docs: Add release notes for 19.3.2 |
| - docs: add SHA256 sums for 19.3.2 |
| - docs: Add release notes for 19.3.2, update calendar and home page |
| - docs: Update release calendar for 20.0 |
| - docs: Add relnotes for 19.3.3 release |
| - docs: Add SHA 256 sums for 19.3.3 |
| - docs: update news, calendar, and link release notes for 19.3.3 |
| - VERSION: bump to 20.0.0-rc1 |
| - bin/pick-ui: Add a new maintainer script for picking patches |
| - .pick_status.json: Update to 0d14f41625fa00187f690f283c1eb6a22e354a71 |
| - .pick_status.json: Update to b550b7ef3b8d12f533b67b1a03159a127a3ff34a |
| - .pick_status.json: Update to 9afdcd64f2c96f3fcc1a28912987f2e8066aa995 |
| - .pick_status.json: Update to 7eaf21cb6f67adbe0e79b80b4feb8c816a98a720 |
| - VERSION: bump to 20.0-rc2 |
| - .pick_status.json: Update to d8bae10bfe0f487dcaec721743cd51441bcc12f5 |
| - .pick_status.json: Update to 689817c9dfde9a0852f2b2489cb0fa93ffbcb215 |
| - .pick_status.json: Update to 23037627359e739c42b194dec54875aefbb9d00b |
| - VERSION: bump for 20.0.0-rc3 |
| - .pick_status.json: Update to 2a98cf3b2ecea43cea148df7f77d2abadfd1c9db |
| - .pick_status.json: Update to 946eacbafb47c8b94d47e7c9d2a8b02fff5a22fa |
| - .pick_status.json: Update to bee5c9b0dc13dbae0ccf124124eaccebf7f2a435 |
| |
| Eduardo Lima Mitev (2): |
| |
| - turnip: Remove failed command buffer from pool |
| - turnip: Fix issues in tu_compute_pipeline_create() that may lead to |
| crash |
| |
| Elie Tournier (4): |
| |
| - Docs: remove duplicate meson docs for windows |
| - docs: fix ascii html representation |
| - nir/algebraic: i2f(f2i()) -> trunc() |
| - nir/algebraic: sqrt(x)*sqrt(x) -> fabs(x) |
| |
| Emmanuel Gil Peyrot (1): |
| |
| - intel/compiler: Return early if read() failed |
| |
| Eric Anholt (102): |
| |
| - ci: Make lava inherit the ccache setup of the .build script. |
| - ci: Switch over to an autoscaling GKE cluster for builds. |
| - Revert "ci: Switch over to an autoscaling GKE cluster for builds." |
| - mesa/st: Add mapping of MESA_FORMAT_RGB_SNORM16 to gallium. |
| - gallium: Add defines for FXT1 texture compression. |
| - gallium: Add some more channel orderings of packed formats. |
| - gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8. |
| - gallium: Add equivalents of packed MESA_FORMAT_*UINT formats. |
| - mesa: Stop defining a full separate format for RGBA_UINT8. |
| - mesa/st: Test round-tripping of all compressed formats. |
| - mesa: Prepare for the MESA_FORMAT\_\* enum to be sparse. |
| - mesa: Redefine MESA_FORMAT\_\* in terms of PIPE_FORMAT_*. |
| - mesa/st: Gut most of st_mesa_format_to_pipe_format(). |
| - mesa/st: Make st_pipe_format_to_mesa_format an effective no-op. |
| - u_format: Fix swizzle of A1R5G5B5. |
| - ci: Use several debian buster packages instead of hand-building. |
| - ci: Make the skip list regexes match the full test name. |
| - ci: Use cts_runner for our dEQP runs. |
| - ci: Enable all of GLES3/3.1 testing for softpipe. |
| - ci: Remove old commented copy of freedreno artifacts. |
| - ci: Disable flappy blit tests on a630. |
| - ci: Expand the freedreno blit skip regex to cover more cases. |
| - util: Move gallium's PIPE_FORMAT utils to /util/format/ |
| - mesa: Move compile of common Mesa core files to a static lib. |
| - mesa/st: Simplify st_choose_matching_format(). |
| - mesa: Don't put sRGB formats in the array format table. |
| - mesa/st: Reuse st_choose_matching_format from st_choose_format(). |
| - util: Add a mapping from VkFormat to PIPE_FORMAT. |
| - turnip: Drop the copy of the formats table. |
| - ci: Move freedreno's parallelism to the runner instead of gitlab-ci |
| jobs. |
| - ci: Use a tag from the parallel-deqp-runner repo. |
| - nir: Add a scheduler pass to reduce maximum register pressure. |
| - nir: Refactor algebraic's block walk |
| - nir: Make algebraic backtrack and reprocess after a replacement. |
| - freedreno: Introduce a fd_resource_layer_stride() helper. |
| - freedreno: Introduce a fd_resource_tile_mode() helper. |
| - freedreno: Introduce a resource layout header. |
| - freedreno: Convert the slice struct to the new resource header. |
| - freedreno/a6xx: Log the tiling mode in resource layout debug. |
| - turnip: Disable timestamp queries for now. |
| - turnip: Fix unused variable warnings. |
| - turnip: Drop redefinition of VALIDREG now that it's in ir3.h. |
| - turnip: Reuse tu6_stage2opcode() more. |
| - turnip: Add basic SSBO support. |
| - turnip: Refactor the graphics pipeline create implementation. |
| - turnip: Add a helper function for getting tu_buffer iovas. |
| - turnip: Sanity check that we're adding valid BOs to the list. |
| - turnip: Move pipeline BO list adding to BindPipeline. |
| - turnip: Add support for compute shaders. |
| - ci: Disable egl_ext_device_drm tests in piglit. |
| - freedreno: Enable texture upload memory throttling. |
| - freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off. |
| - freedreno: Track the set of UBOs to be uploaded in UBO analysis. |
| - freedreno: Drop the extra offset field for mipmap slices. |
| - freedreno: Refactor the UBWC flags registers emission. |
| - freedreno: Move UBWC layout into a slices array like the non-UBWC |
| slices. |
| - tu: Move our image layout into a freedreno_layout struct. |
| - freedreno: Move a6xx's setup_slices() to a shareable helper function. |
| - freedreno: Switch the 16-bit workaround to match what turnip does. |
| - tu: Move UBWC layout into fdl6_layout() and use that function. |
| - turnip: Lower usub_borrow. |
| - turnip: Drop unused variable. |
| - turnip: Add support for descriptor arrays. |
| - turnip: Fix support for immutable samplers. |
| - ci: Fix caselist results archiving after parallel-deqp-runner rename. |
| - mesa: Fix detection of invalidating both depth and stencil. |
| - mesa/st: Deduplicate the NIR uniform lowering code. |
| - mesa/st: Move the vec4 type size function into core GLSL types. |
| - mesa/prog: Reuse count_vec4_slots() from ir_to_mesa. |
| - mesa/st: Move the dword slot counting function to glsl_types as well. |
| - i965: Reuse the new core glsl_count_dword_slots(). |
| - nir: Fix printing of ~0 .locations. |
| - turnip: Refactor linkage state setup. |
| - mesa: Make atomic lowering put atomics above SSBOs. |
| - gallium: Pack the atomic counters just above the SSBOs. |
| - nir: Drop the ssbo_offset to atomic lowering. |
| - compiler: Add a note about how num_ssbos works in the program info. |
| - freedreno: Stop scattered remapping of SSBOs/images to IBOs. |
| - radeonsi: Remove a bunch of default handling of pipe caps. |
| - r600: Remove a bunch of default handling of pipe caps. |
| - r300: Remove a bunch of default handling of pipe caps. |
| - radeonsi: Drop PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS. |
| - turnip: Fix some whitespace around binary operators. |
| - turnip: Refactor the intrinsic lowering. |
| - turnip: Add limited support for storage images. |
| - turnip: Disable UBWC on images used as storage images. |
| - turnip: Add support for non-zero (still constant) UBO buffer indices. |
| - turnip: Add support for uniform texel buffers. |
| - freedreno/ir3: Plumb the ir3_shader_variant into legalize. |
| - turnip: Add support for fine derivatives. |
| - turnip: Fix execution of secondary cmd bufs with nothing in primary. |
| - freedreno: Add some missing a6xx address declarations. |
| - freedreno: Fix OUT_REG() on address regs without a .bo supplied. |
| - turnip: Port krh's packing macros from freedreno to tu. |
| - turnip: Convert renderpass setup to the new register packing macros. |
| - turnip: Convert the rest of tu_cmd_buffer.c over to the new pack |
| macros. |
| - vulkan/wsi: Fix compiler warning when no WSI platforms are enabled. |
| - iris: Silence warning about AUX_USAGE_MC. |
| - mesa/st: Fix compiler warnings from INTEL_shader_integer_functions. |
| - ci: Enable -Werror on the meson-i386 build. |
| - tu: Fix binning address setup after pack macros change. |
| - Revert "gallium: Fix big-endian addressing of non-bitmask array |
| formats." |
| |
| Eric Engestrom (58): |
| |
| - meson: split out idep_xmlconfig_headers from idep_xmlconfig |
| - anv: add missing xmlconfig headers dependency |
| - radv: drop unnecessary xmlpool_options_h |
| - pipe-loader: drop unnecessary xmlpool_options_h |
| - loader: replace xmlpool_options_h with idep_xmlconfig_headers |
| - targets/omx: replace xmlpool_options_h with idep_xmlconfig_headers |
| - targets/va: replace xmlpool_options_h with idep_xmlconfig_headers |
| - targets/vdpau: replace xmlpool_options_h with idep_xmlconfig_headers |
| - targets/xa: replace xmlpool_options_h with idep_xmlconfig_headers |
| - targets/xvmc: replace xmlpool_options_h with idep_xmlconfig_headers |
| - dri: replace xmlpool_options_h with idep_xmlconfig_headers |
| - i915: replace xmlpool_options_h with idep_xmlconfig_headers |
| - nouveau: replace xmlpool_options_h with idep_xmlconfig_headers |
| - r200: replace xmlpool_options_h with idep_xmlconfig_headers |
| - radeon: replace xmlpool_options_h with idep_xmlconfig_headers |
| - meson: move idep_xmlconfig_headers to xmlpool/ |
| - gitlab-ci: build a recent enough version of GLVND (ie. 1.2.0) |
| - meson: require glvnd 1.2.0 |
| - meson: revert glvnd workaround |
| - meson: add variable to control the symbols checks |
| - meson: move the generic symbols check arguments to a common variable |
| - meson: add windows support to symbols checks |
| - meson: require \`nm\` again on Unix systems |
| - mesa/imports: let the build system detect strtok_r() |
| - egl: fix \_EGL_NATIVE_PLATFORM fallback |
| - egl: move #include of local headers out of Khronos headers |
| - gitlab-ci: build libdrm using meson instead of autotools |
| - gitlab-ci: auto-cancel CI runs when a newer commit is pushed to the |
| same branch |
| - CL: sync C headers with Khronos |
| - CL: sync C++ headers with Khronos |
| - vulkan: delete typo'd header |
| - egl: use EGL_CAST() macro in eglmesaext.h |
| - anv: add missing "fall-through" annotation |
| - vk_util: drop duplicate formats in vk_format_map[] |
| - meson: drop duplicate \`lib\` prefix on libiris_gen\* |
| - meson: drop \`intel_\` prefix on imgui_core |
| - docs: reword a bit and list HTTPS before FTP |
| - intel: add mi_builder_test for gen12 |
| - intel/compiler: add ASSERTED annotation to avoid "unused variable" |
| warning |
| - intel/compiler: replace \`0\` pointer with \`NULL\` |
| - util/simple_mtx: don't set the canary when it can't be checked |
| - anv: drop unused #include |
| - travis: autodetect python version instead of hard-coding it |
| - util/format: remove left-over util_format_description_table |
| declaration |
| - util/format: add PIPE_FORMAT_ASTC_*x*x*_SRGB to |
| util_format_{srgb,linear}() |
| - util/format: add trivial srgb<->linear conversion test |
| - u_format: move format tests to util/tests/ |
| - amd: fix empty-body issues |
| - nine: fix empty-body-issues |
| - meson: simplify install_megadrivers.py invocation |
| - mesa: avoid returning a value in a void function |
| - meson: use github URL for wraps instead of completely unreliable |
| wrapdb |
| - egl: drop confusing mincore() error message |
| - llvmpipe: drop LLVM < 3.4 support |
| - util/atomic: fix return type of p_atomic_add_return() fallback |
| - util/os_socket: fix header unavailable on windows |
| - freedreno/perfcntrs: fix fd leak |
| - util/disk_cache: check for write() failure in the zstd path |
| |
| Erico Nunes (17): |
| |
| - lima: fix nir shader memory leak |
| - lima: fix bo submit memory leak |
| - lima/ppir: enable lower_fdph |
| - gallium/util: add alignment parameter to util_upload_index_buffer |
| - lima: allocate separate bo to store varyings |
| - lima: refactor indexed draw indices upload |
| - vc4: move the draw splitting routine to shared code |
| - lima: split draw calls on 64k vertices |
| - lima/ppir: fix lod bias src |
| - lima/ppir: remove assert on ppir_emit_tex unsupported feature |
| - lima: set shader caps to optimize control flow |
| - lima/ppir: remove orphan load node after cloning |
| - lima/ppir: implement full liveness analysis for regalloc |
| - lima/ppir: handle write to dead registers in ppir |
| - lima/ppir: fix ssa undef emit |
| - lima/ppir: split ppir_op_undef into undef and dummy again |
| - lima/ppir: fix src read mask swizzling |
| |
| Erik Faye-Lund (82): |
| |
| - zink: heap-allocate samplers objects |
| - zink: emit line-width when using polygon line-mode |
| - anv: remove incorrect polygonMode=point early-out |
| - zink: use actual format for render-pass |
| - zink: always allow mutating the format |
| - zink: do not advertize coherent mapping |
| - zink: disable fragment-shader texture-lod |
| - zink: transition resources before resolving |
| - zink: always allow sampling of images |
| - zink: use u_blitter when format-reinterpreting |
| - zink/spirv: drop temp-array for component-count |
| - zink/spirv: support loading bool constants |
| - zink/spirv: implement bany_fnequal[2-4] |
| - zink/spirv: implement bany_inequal[2-4] |
| - zink/spirv: implement ball_iequal[2-4] |
| - zink/spirv: implement ball_fequal[2-4] |
| - zink: do advertize integer support in shaders |
| - zink/spirv: add support for nir_op_flrp |
| - zink: correct depth-stencil format |
| - nir: patch up deref-vars when lowering clip-planes |
| - zink: always allow transfer to/from buffers |
| - zink: implement buffer-to-buffer copies |
| - zink: remove no-longer-needed hack |
| - zink: move format-checking to separate source |
| - zink: move filter-helper to separate helper-header |
| - zink: move blitting to separate source |
| - zink: move drawing to separate source |
| - st/mesa: unmap pbo after updating cache |
| - zink: use true/false instead of TRUE/FALSE |
| - zink: reject invalid sample-counts |
| - zink: fix crash when restoring sampler-states |
| - zink: delete query rather than allocating a new one |
| - zink: do not try to destroy NULL-fence |
| - zink: handle calloc-failure |
| - zink: avoid NULL-deref |
| - zink: avoid NULL-deref |
| - zink: avoid NULL-deref |
| - zink: error-check right variable |
| - zink: silence coverity error |
| - zink: enable PIPE_CAP_MIXED_COLORBUFFER_FORMATS |
| - zink: implement nir_texop_txd |
| - zink: implement txf |
| - zink: implement some more trivial opcodes |
| - zink: simplify front-face type |
| - zink: factor out builtin-var creation |
| - zink: implement load_vertex_id |
| - zink: use nir_fmul_imm |
| - zink: remove unused code-path in lower_pos_write |
| - nir/zink: move clip_halfz-lowering to common code |
| - etnaviv: use nir_lower_clip_halfz instead of open-coding |
| - st/mesa: use uint-samplers for sampling stencil buffers |
| - zink: fixup initialization of operand_mask / num_extra_operands |
| - util: initialize float-array with float-literals |
| - st/wgl: eliminate implicit cast warning |
| - gallium: fix a warning |
| - mesa/st: use float literals |
| - docs: fix typo in html tag name |
| - docs: fix paragraphs |
| - docs: open paragraph before closing it |
| - docs: use code-tag instead of pre-tag |
| - docs: use code-tags instead of pre-tags |
| - docs: use code-tags instead of pre-tags |
| - docs: move paragraph closing tag |
| - docs: remove double-closed definition-list |
| - docs: do not double-close link tag |
| - docs: do not use definition-list for sub-topics |
| - docs: use figure/figcaption instead of tables |
| - docs: remove trailing header |
| - docs: remove leading spaces |
| - docs: remove trailing newlines |
| - docs: use [1] instead of asterisk for footnote |
| - docs: remove pointless, stray newline |
| - docs: fixup indentation |
| - zink: implement nir_texop_txs |
| - zink: support offset-variants of texturing |
| - zink: avoid incorrect vector-construction |
| - zink: store image-type per texture |
| - zink: support sampling non-float textures |
| - zink: support arrays of samplers |
| - zink: set compareEnable when setting compareOp |
| - st/mesa: use uint-result for sampling stencil buffers |
| - Revert "nir: Add a couple trivial abs optimizations" |
| |
| Florian Will (1): |
| |
| - radv/winsys: set IB flags prior to submit in the sysmem path |
| |
| Francisco Jerez (26): |
| |
| - glsl: Fix software 64-bit integer to 32-bit float conversions. |
| - intel/fs/gen11+: Handle ROR/ROL in lower_simd_width(). |
| - intel/fs/gen8+: Fix r127 dst/src overlap RA workaround for EOT |
| message payload. |
| - intel/fs: Fix nir_intrinsic_load_barycentric_at_sample for SIMD32. |
| - intel/fs/cse: Fix non-deterministic behavior due to inaccurate |
| liveness calculation. |
| - intel/fs: Make implied_mrf_writes() an fs_inst method. |
| - intel/fs: Try to vectorize header setup in lower_load_payload(). |
| - intel/fs: Generalize fs_reg::is_contiguous() to register files other |
| than VGRF. |
| - intel/fs: Rework fs_inst::is_copy_payload() into multiple |
| classification helpers. |
| - intel/fs: Extend copy propagation dataflow analysis to copies with |
| FIXED_GRF source. |
| - intel/fs: Add partial support for copy-propagating FIXED_GRFs. |
| - intel/fs: Add support for copy-propagating a block of multiple |
| FIXED_GRFs. |
| - intel/fs: Allow limited copy propagation of a LOAD_PAYLOAD into |
| another. |
| - intel/fs/gen4-6: Allocate registers from aligned_pairs_class based on |
| LINTERP use. |
| - intel/fs/gen6: Constrain barycentric source of LINTERP during bank |
| conflict mitigation. |
| - intel/fs/gen6: Generalize aligned_pairs_class to SIMD16 aligned |
| barycentrics. |
| - intel/fs/gen6: Use SEL instead of bashing thread payload for unlit |
| centroid workaround. |
| - intel/fs: Split fetch_payload_reg() into separate helper for |
| barycentrics. |
| - intel/fs: Introduce barycentric layout lowering pass. |
| - intel/fs: Switch to standard vector layout for barycentrics at |
| optimization time. |
| - intel/fs/cse: Make HALT instruction act as CSE barrier. |
| - intel/fs/gen7: Fix fs_inst::flags_written() for |
| SHADER_OPCODE_FIND_LIVE_CHANNEL. |
| - intel/fs: Add virtual instruction to load mask of live channels into |
| flag register. |
| - intel/fs/gen12: Workaround unwanted SEND execution due to broken |
| NoMask control flow. |
| - intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch |
| writes. |
| - intel/fs/gen12: Workaround data coherency issues due to broken NoMask |
| control flow. |
| |
| Fritz Koenig (1): |
| |
| - freedreno: reorder format check |
| |
| Georg Lehmann (3): |
| |
| - Correctly wait in the fragment stage until all semaphores are |
| signaled |
| - Vulkan Overlay: Don't try to change the image layout to present twice |
| - Vulkan overlay: use the corresponding image index for each swapchain |
| |
| Gert Wollny (12): |
| |
| - r600: Disable eight bit three channel formats |
| - virgl: Increase the shader transfer buffer by doubling the size |
| - gallium/tgsi_from_mesa: Add 'extern "C"' to be able to include from |
| C++ |
| - nir: make nir_get_texture_size/lod available outside nir_lower_tex |
| - gallium: tgsi_from_mesa - handle VARYING_SLOT_FACE |
| - r600: Add functions to dump the shader info |
| - r600: Make it possible to include r600_asm.h in a C++ file |
| - r600/sb: Correct SB disassembler for better debugging
| - r600: Fix maximum line width |
| - r600: Make SID an unsigned value
| - r600: Delete vertex buffer only if there is actually a shader state |
| - mesa/st: glsl_to_nir: don't lower atomics to SSBOs if driver supports |
| HW atomics |
| |
| Guido Günther (2): |
| |
| - etnaviv: drm: Don't miscalculate timeout |
| - freedreno/drm: Don't miscalculate timeout |
| |
| Gurchetan Singh (11): |
| |
| - drirc: set allow_higher_compat_version for Faster Than Light |
| - virgl/drm: update UAPI |
| - teximage: split out helper from EGLImageTargetTexture2DOES |
| - glapi / teximage: implement EGLImageTargetTexStorageEXT |
| - dri_util: add driImageFormatToSizedInternalGLFormat function |
| - i965: track if image is created by a dmabuf |
| - i965: refactor intel_image_target_texture_2d |
| - i965: support EXT_EGL_image_storage |
| - st/dri: track if image is created by a dmabuf |
| - st/mesa: refactor egl image binding a bit |
| - st/mesa: implement EGLImageTargetTexStorage |
| |
| Hyunjun Ko (7): |
| |
| - freedreno/ir3: cleanup by removing repeated code |
| - freedreno: support 16b for the sampler opcode |
| - freedreno/ir3: fix printing output registers of FS. |
| - freedreno/ir3: fixup when changing to mad.f16 |
| - freedreno/ir3: enable half precision for pre-fs texture fetch |
| - turnip: fix invalid VK_ERROR_OUT_OF_POOL_MEMORY |
| - freedreno/ir3: put the conversion back for half const to the right |
| place. |
| |
| Iago Toral Quiroga (32): |
| |
| - v3d: rename vertex shader key (num)_fs_inputs fields |
| - mesa/st: make sure we remove dead IO variables before handing NIR to |
| backends |
| - glsl: add missing initialization of the location path field |
| - v3d: fix indirect BO allocation for uniforms |
| - v3d: actually root the first BO in a command list in the job |
| - v3d: add missing plumbing for VPM load instructions |
| - v3d: add debug assert |
| - v3d: enable debug options for geometry shader dumps |
| - v3d: remove unused variable |
| - v3d: add initial compiler plumbing for geometry shaders |
| - v3d: fix packet descriptions for geometry and tessellation shaders |
| - v3d: emit geometry shader state commands |
| - v3d: implement geometry shader instancing |
| - v3d: add 1-way SIMD packing definition |
| - v3d: compute appropriate VPM memory configuration for geometry shader |
| workloads |
| - v3d: we always have at least one output segment |
| - v3d: add support for adjacency primitives |
| - v3d: don't try to render if shaders failed to compile |
| - v3d: predicate geometry shader outputs inside non-uniform control |
| flow |
| - v3d: save geometry shader state for blitting |
| - v3d: support transform feedback with geometry shaders |
| - v3d: remove obsolete assertion |
| - v3d: do not limit new CL space allocations with branch to 4096 bytes |
| - v3d: support rendering to multi-layered framebuffers |
| - v3d: move layer rendering to a separate helper |
| - v3d: handle writes to gl_Layer from geometry shaders |
| - v3d: fix primitive queries for geometry shaders |
| - v3d: disable lowering of indirect inputs |
| - v3d: support precompiling geometry shaders |
| - v3d: expose OES_geometry_shader |
| - u_vbuf: don't try to delete NULL driver CSO |
| - v3d: fix bug when checking result of syncobj fence import |
| |
| Ian Romanick (39): |
| |
| - intel/compiler: Report the number of non-spill/fill SEND messages on |
| vec4 too |
| - nir/algebraic: Add the ability to mark a replacement as exact |
| - nir/algebraic: Mark other comparison exact when removing a == a |
| - intel/fs: Disable conditional discard optimization on Gen4 and Gen5 |
| - nir/range-analysis: Add pragmas to help loop unrolling |
| - nir/range_analysis: Make sure the table validation only occurs once |
| - nir/opt_peephole_select: Don't count some unary operations |
| - intel/compiler: Increase nir_opt_peephole_select threshold |
| - nir/algebraic: Simplify some Inf and NaN avoidance code |
| - nir/algebraic: Rearrange bcsel sequences generated by |
| nir_opt_peephole_select |
| - intel/compiler: Fix 'comparison is always true' warning |
| - mesa: Silence 'left shift of negative value' warning in BPTC |
| compression code |
| - mesa: Silence unused parameter warning |
| - anv: Fix error message format string |
| - mesa: Extension boilerplate for INTEL_shader_integer_functions2 |
| - glsl: Add new expressions for INTEL_shader_integer_functions2 |
| - glsl_types: Add function to get an unsigned base type from a signed |
| type |
| - glsl: Add built-in functions for INTEL_shader_integer_functions2 |
| - nir: Add new instructions for INTEL_shader_integer_functions2 |
| - nir/algebraic: Add lowering for uabs_usub and uabs_isub |
| - nir/algebraic: Add lowering for 64-bit hadd and rhadd |
| - nir/algebraic: Add lowering for 64-bit usub_sat |
| - nir/algebraic: Add lowering for 64-bit uadd_sat |
| - nir/algebraic: Add lowering for 64-bit iadd_sat and isub_sat |
| - compiler: Translate GLSL IR to NIR for new |
| INTEL_shader_integer_functions2 expressions |
| - intel/fs: Don't lower integer multiplies that don't need lowering |
| - intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops |
| - intel/fs: Implement support for NIR opcodes for |
| INTEL_shader_integer_functions2 |
| - nir/spirv: Translate SPIR-V to NIR for new |
| INTEL_shader_integer_functions2 opcodes |
| - spirv: Silence a bunch of unused parameter warnings |
| - spirv: Add support for IntegerFunctions2INTEL capability |
| - i965: Enable INTEL_shader_integer_functions2 on Gen8+ |
| - gallium: Add a cap bit for OpenCL-style extended integer functions |
| - gallium: Add a cap bit for integer multiplication between 32-bit and |
| 16-bit |
| - iris: Enable INTEL_shader_integer_functions2 |
| - anv: Enable SPV_INTEL_shader_integer_functions2 and |
| VK_INTEL_shader_integer_functions2 |
| - nir/algebraic: Optimize some 64-bit integer comparisons involving |
| zero |
| - relnotes: Add GL_INTEL_shader_integer_functions2 and |
| VK_INTEL_shader_integer_functions2 |
| - intel/fs: Don't count integer instructions as being possibly coissue |
| |
| Icecream95 (16): |
| |
| - gallium/auxiliary: Reduce conversions in |
| u_vbuf_get_minmax_index_mapped |
| - gallium/auxiliary: Handle count == 0 in |
| u_vbuf_get_minmax_index_mapped |
| - panfrost: Add negative lod bias support |
| - panfrost: Compact the bo_access readers array |
| - panfrost: Dynamically allocate shader variants |
| - panfrost: Add ETC1/ETC2 texture formats |
| - panfrost: Add ASTC texture formats |
| - pan/midgard: Fix bundle dynarray leak |
| - pan/midgard: Fix a memory leak in the disassembler |
| - pan/midgard: Support disassembling to a file |
| - pan/bifrost: Support disassembling to a file |
| - pan/decode: Support dumping to a file |
| - pan/decode: Dump to a file |
| - pan/decode: Rotate trace files |
| - panfrost: Don't copy uniforms when the size is zero |
| - pan/midgard: Fix a liveness info leak |
| |
| Icenowy Zheng (2): |
| |
| - lima: support indexed draw with bias |
| - lima: fix lima_set_vertex_buffers() |
| |
| Ilia Mirkin (7): |
| |
| - gm107/ir: fix loading z offset for layered 3d image bindings |
| - nv50/ir: mark STORE destination inputs as used |
| - nv50,nvc0: fix destination coordinates of blit |
| - nvc0: add dummy reset status support |
| - gm107/ir: avoid combining geometry shader stores at 0x60 |
| - nvc0: treat all draws without color0 broadcast as MRT |
| - nvc0: disable xfb's which don't have a stride |
| |
| Italo Nicola (1): |
| |
| - intel/compiler: remove old comment |
| |
| Iván Briano (4): |
| |
| - intel/compiler: Don't change hstride if not needed |
| - anv: Export filter_minmax support only when it's really supported |
| - anv: Export VK_KHR_buffer_device_address only when really supported |
| - anv: Enable Vulkan 1.2 support |
| |
| James Xiong (3): |
| |
| - iris: try to set the specified tiling when importing a dmabuf |
| - gallium: dmabuf support for yuv formats that are not natively |
| supported |
| - gallium: let the pipe drivers decide the supported modifiers |
| |
| Jan Vesely (2): |
| |
| - clover: Initialize Asm Parsers |
| - clover: Use explicit conversion from llvm::StringRef to std::string |
| |
| Jan Zielinski (8): |
| |
| - gallium/swr: Fix depth values for blit scenario |
| - swr/rasterizer: Add tessellator implementation to the rasterizer |
| - gallium/swr: Fix Windows build |
| - gallium/gallivm/tgsi: enable tessellation shaders |
| - gallium/gallivm: enable linking lp_bld_printf function with C++ code |
| - gallium/swr: implementation of tessellation shaders compilation |
| - gallium/swr: fix tessellation state save/restore |
| - docs: Update SWR tessellation support |
| |
| Jason Ekstrand (212): |
| |
| - util: Add a util_sparse_array data structure |
| - anv: Move refcount to anv_bo |
| - anv: Use a util_sparse_array for the GEM handle -> BO map |
| - anv: Fix a relocation race condition |
| - anv: Stop storing the GEM handle in anv_reloc_list_add |
| - anv: Declare the bo in the anv_block_pool_foreach_bo loop |
| - anv: Inline anv_block_pool_get_bo |
| - anv: Replace ANV_BO_EXTERNAL with anv_bo::is_external |
| - anv: Handle state pool relocations using "wrapper" BOs |
| - anv: Fix a potential BO handle leak |
| - anv: Rework anv_block_pool_expand_range |
| - anv: Use anv_block_pool_foreach_bo in get_bo_from_pool |
| - anv: Rework the internal BO allocation API |
| - anv: Choose BO flags internally in anv_block_pool |
| - anv/tests: Zero-initialize instances |
| - anv/tests: Initialize the BO cache and device mutex |
| - anv: Allocate block pool BOs from the cache |
| - anv: Use the query_slot helper in vkResetQueryPoolEXT |
| - anv: Allocate query pool BOs from the cache |
| - anv: Set more flags on descriptor pool buffers |
| - anv: Allocate descriptor buffers from the BO cache |
| - util: Add a free list structure for use with util_sparse_array |
| - anv: Allocate batch and fence buffers from the cache |
| - anv: Allocate scratch BOs from the cache |
| - anv: Allocate misc BOs from the cache |
| - anv: Drop anv_bo_init and anv_bo_init_new |
| - anv: Add a device parameter to anv_execbuf_add_bo |
| - anv: Set the batch allocator for compute pipelines |
| - anv: Use a bitset for tracking residency |
| - anv: Zero released anv_bo structs |
| - anv: Use the new BO alloc API for Android |
| - anv: Don't delete fragment shaders that write sample mask |
| - anv: Don't claim the null RT as a valid color target |
| - anv: Stop compacting render targets in the binding table |
| - anv: Move the RT BTI flush workaround to begin_subpass |
| - spirv: Remove the type from sampled_image |
| - spirv: Add a vtn_decorate_pointer helper |
| - spirv: Sort out the mess that is sampled image |
| - nir/builder: Add a nir_extract_bits helper |
| - nir: Add tests for nir_extract_bits |
| - intel/nir: Use nir_extract_bits in lower_mem_access_bit_sizes |
| - intel/fs: Add DWord scattered read/write opcodes |
| - intel/fs: refactor surface header setup |
| - intel/nir: Plumb devinfo through lower_mem_access_bit_sizes |
| - intel/fs: Implement the new load/store_scratch intrinsics |
| - intel/fs: Lower large local arrays to scratch |
| - anv: Lock around fetching sync file FDs from semaphores |
| - anv: Plumb timeline semaphore signal/wait values through from the API |
| - spirv: Fix the MSVC build |
| - anv/pipeline: Assume layout != NULL |
| - genxml: Mark everything in genX_pack.h always_inline |
| - anv: Input attachments are always single-plane |
| - anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout |
| - anv: Delete dead shader constant pushing code |
| - anv: Stop bounds-checking pushed UBOs |
| - anv: Pre-compute push ranges for graphics pipelines |
| - intel/compiler: Add a flag to avoid compacting push constants |
| - anv: Re-arrange push constant data a bit |
| - anv: Rework push constant handling |
| - anv: Use a switch statement for binding table setup |
| - anv: More carefully dirty state in BindDescriptorSets |
| - anv: More carefully dirty state in BindPipeline |
| - anv: Use an anv_state for the next binding table |
| - anv: Emit a NULL vertex for zero base_vertex/instance |
| - nir: Validate that variables are in the right lists |
| - iris: Re-enable param compaction |
| - Revert "i965/fs: Merge CMP and SEL into CSEL on Gen8+" |
| - vulkan/enum_to_str: Handle out-of-order aliases |
| - anv/entrypoints: Better handle promoted extensions |
| - vulkan: Update the XML and headers to 1.1.129 |
| - anv: Push constants are relative to dynamic state on IVB |
| - anv: Set up SBE_SWIZ properly for gl_Viewport |
| - anv: Respect the always_flush_cache driconf option |
| - iris: Stop setting up fake params |
| - anv: Drop bo_flags from anv_bo_pool |
| - anv: Add a has_softpin boolean |
| - blorp: Pass the VB size to the VF cache workaround |
| - anv: Always invalidate the VF cache in BeginCommandBuffer |
| - anv: Apply cache flushes after setting index/draw VBs |
| - anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA |
| - anv: Don't leak when set_tiling fails |
| - util/atomic: Add a \_return variant of p_atomic_add |
| - anv: Disallow allocating above heap sizes |
| - anv: Stop tracking VMA allocations |
| - anv: Set up VMA heaps independently from memory heaps |
| - anv: Stop advertising two heaps just for the VF cache WA |
| - anv: Add an explicit_address parameter to anv_device_alloc_bo |
| - util/vma: Factor out the hole splitting part of util_vma_heap_alloc |
| - util/vma: Add a function to allocate a particular address range |
| - anv: Add allocator support for client-visible addresses |
| - anv: Use a pNext loop in AllocateMemory |
| - anv: Implement VK_KHR_buffer_device_address |
| - util/atomic: Add p_atomic_add_return for the unlocked path |
| - vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit |
| - vulkan/wsi: Add hooks for signaling semaphores and fences
| - anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags |
| - anv: Use submit-time implicit sync instead of allocate-time |
| - anv: Add a fence_reset_reset_temporary helper |
| - anv: Use BO fences/semaphores for AcquireNextImage |
| - anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers |
| - anv: Re-capture all batch and state buffers |
| - anv: Re-emit all compute state on pipeline switch |
| - ANV: Stop advertising smoothLines support on gen10+ |
| - anv: Flush the queue on DeviceWaitIdle |
| - anv: Unconditionally advertise Vulkan 1.1 |
| - anv: Bump the advertised patch version to 129 |
| - i965: Enable GL_EXT_gpu_shader4 on Gen6+ |
| - anv: Properly advertise sampledImageIntegerSampleCounts |
| - anv: Drop unneeded struct keywords |
| - blorp: Stop whacking Z24 depth to BGRA8 |
| - blorp: Allow reading with HiZ |
| - i965/blorp: Don't resolve HiZ unless we're reinterpreting |
| - intel/blorp: Use the source format when using blorp_copy with HiZ |
| - anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9 |
| - i965: Allow HiZ for glCopyImageSubData sources |
| - intel/nir: Add a memory barrier before barrier() |
| - intel/disasm: Fix decoding of src0 of SENDS |
| - genxml: Remove a non-existent HW bit
| - anv: Don't add dynamic state base address to push constants on Gen7 |
| - anv: Flag descriptors dirty when gl_NumWorkgroups is used |
| - anv: Re-use flush_descriptor_sets in flush_compute_state |
| - intel/vec4: Support scoped_memory_barrier |
| - nir: Handle more barriers in dead_write and copy_prop |
| - nir: Handle barriers with more granularity in combine_stores |
| - llvmpipe: No-op implement more barriers
| - nir: Add a new memory_barrier_tcs_patch intrinsic |
| - spirv: Add a workaround for OpControlBarrier on old GLSLang |
| - spirv: Add output memory semantics to OpControlBarrier in TCS |
| - nir/glsl: Emit memory barriers as part of barrier() |
| - intel/nir: Stop adding redundant barriers |
| - nir: Rename nir_intrinsic_barrier to control_barrier |
| - nir/lower_atomics_to_ssbo: Also lower barriers |
| - anv: Drop an unused variable |
| - intel/blorp: Fill out all the dwords of MI_ATOMIC |
| - anv: Don't over-advertise descriptor indexing features |
| - anv: Memset array properties |
| - vulkan/wsi: Add a driconf option to force WSI to advertise |
| BGRA8_UNORM first |
| - vulkan: Update the XML and headers to 1.2.131 |
| - turnip: Pretend to support Vulkan 1.2 |
| - anv: Bump the patch version to 131 |
| - anv,nir: Lower quad_broadcast with dynamic index in NIR |
| - anv: Implement the new core version feature queries |
| - anv: Implement the new core version property queries |
| - relnotes: Add Vulkan 1.2 |
| - anv: Drop some VK_IMAGE_TILING_OPTIMAL checks |
| - anv: Support modifiers in GetImageFormatProperties2 |
| - vulkan/wsi: Move the ImageCreateInfo higher up |
| - vulkan/wsi: Use the interface from the real modifiers extension |
| - vulkan/wsi: Filter modifiers with ImageFormatProperties |
| - vulkan/wsi: Implement VK_KHR_swapchain_mutable_format |
| - anv/blorp: Rename buffer image stride parameters |
| - anv: Canonicalize buffer formats for image/buffer copies |
| - anv: Add an anv_physical_device field to anv_device |
| - anv: Take an anv_device in vk_errorf |
| - anv: Take a device in anv_perf_warn |
| - anv: Stop allocating WSI event fences off the instance |
| - anv: Drop the instance pointer from anv_device |
| - anv: Move the physical device dispatch table to anv_instance |
| - anv: Drop separate chipset_id fields |
| - anv: Re-arrange physical_device_init |
| - anv: Allow enumerating multiple physical devices |
| - anv/apply_pipeline_layout: Initialize the nir_builder before use |
| - intel/blorp: resize src and dst surfaces separately |
| - anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves |
| - anv: Add a layout_to_aux_state helper |
| - anv: Use isl_aux_state for HiZ resolves |
| - anv: Add a usage parameter to anv_layout_to_aux_usage |
| - anv: Allow HiZ in read-only depth layouts |
| - anv: Improve BTI change cache flushing |
| - intel/fs: Don't unnecessarily fall back to indirect sends on Gen12 |
| - intel/disasm: Properly disassemble indirect SENDs |
| - intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s) |
| - intel/isl: Add a hack for the Gen12 A0 texture buffer bug |
| - anv: Rework the meaning of anv_image::planes[]::aux_usage |
| - anv: Replace aux_surface.isl.size_B checks with aux_usage checks |
| - intel/aux-map: Add some #defines |
| - intel/aux-map: Factor out some useful helpers |
| - anv: Delete a redundant calculation |
| - isl: Add a helper for calculating subimage memory ranges |
| - anv: Add another align_down helper |
| - anv: Make AUX table invalidate a PIPE\_\* bit |
| - anv: Make anv_vma_alloc/free a lot dumber |
| - anv: Rework CCS memory handling on TGL-LP |
| - intel/blorp: Add support for CCS_E copies with UNORM formats |
| - intel/isl: Allow CCS_E on more formats |
| - intel/genxml: Make SO_DECL::"Hole Flag" a Boolean |
| - anv: Insert holes for non-existent XFB varyings
| - intel/blorp: Handle bit-casting UNORM and BGRA formats |
| - anv: Replace one more aux_surface.isl.size_B check |
| - intel/mi_builder: Force write completion on Gen12+ |
| - anv: Set actual state pool sizes when we have softpin |
| - anv: Re-use one old BT block in reset_batch_bo_chain |
| - anv/block_pool: Ensure allocations have contiguous maps |
| - anv: Rename a variable |
| - genxml: Add a new 3DSTATE_SF field on gen12 |
| - anv,iris: Set 3DSTATE_SF::DerefBlockSize to per-poly on Gen12+ |
| - intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11 |
| - iris: Set SLMEnable based on the L3$ config |
| - iris: Store the L3$ configs in the screen |
| - iris: Use the URB size from the L3$ config |
| - i965: Re-emit l3 state before BLORP executes |
| - intel: Take a gen_l3_config in gen_get_urb_config |
| - intel/blorp: Always emit URB config on Gen7+ |
| - iris: Consolidate URB emit
| - anv: Emit URB setup earlier |
| - intel/common: Return the block size from get_urb_config |
| - intel/blorp: Plumb deref block size through to 3DSTATE_SF |
| - anv: Plumb deref block size through to 3DSTATE_SF |
| - iris: Plumb deref block size through to 3DSTATE_SF |
| - anv: Always fill out the AUX table even if CCS is disabled |
| - intel/fs: Write the address register with NoMask for MOV_INDIRECT |
| - anv/blorp: Use the correct size for vkCmdCopyBufferToImage |
| |
| Jonathan Gray (4): |
| |
| - winsys/amdgpu: avoid double simple_mtx_unlock() |
| - i965: update Makefile.sources for perf changes |
| - util/futex: use futex syscall on OpenBSD |
| - util/u_thread: don't restrict u_thread_get_time_nano() to \__linux_\_ |
| |
| Jonathan Marek (98): |
| |
| - freedreno: add Adreno 640 ID |
| - freedreno/ir3: disable texture prefetch for 1d array textures |
| - freedreno/registers: fix a6xx_2d_blit_cntl ROTATE |
| - etnaviv: blt: use only for tiling, and add missing formats |
| - etnaviv: separate PE and RS formats, use RS only for tiling
| - etnaviv: blt: set TS dirty after clear |
| - turnip: add display wsi |
| - turnip: add x11 wsi |
| - turnip: implement CmdClearColorImage/CmdClearDepthStencilImage |
| - turnip: fix sRGB GMEM clear |
| - util: add missing R8G8B8A8_SRGB format to vk_format_map |
| - freedreno/regs: update UBWC related bits |
| - turnip: implement UBWC |
| - etnaviv: avoid using RS for 64bpp formats |
| - etnaviv: implement 64bpp clear |
| - etnaviv: blt: fix partial ZS clears with TS |
| - etnaviv: support 3d/array/integer formats in texture descriptors |
| - turnip: fix integer render targets |
| - freedreno/registers: add missing MH perfcounter enum for a2xx |
| - freedreno/perfcntrs: add a2xx MH counters |
| - freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds |
| - freedreno/perfcntrs/fdperf: add missing a20x compatible |
| - freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter |
| - turnip: fix display wsi fence timing out |
| - turnip: don't skip unused attachments when setting up tiling config |
| - turnip: implement CmdClearAttachments |
| - turnip: don't set unused BLIT_DST_INFO bits for GMEM clear |
| - turnip: MSAA resolve directly from GMEM |
| - turnip: allow writes to draw_cs outside of render pass |
| - turnip: add function to allocate aligned memory in a substream cs |
| - turnip: improve emit_textures |
| - turnip: implement border color |
| - turnip: add hw binning |
| - turnip: fix incorrectly failing assert |
| - freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info |
| - freedreno/registers: add a6xx texture format for stencil sampler |
| - turnip: fix hw binning render area |
| - turnip: fix tile layout logic |
| - turnip: update tile_align_w/tile_align_h |
| - turnip: set load_layer_id to zero |
| - turnip: set FRAG_WRITES_SAMPMASK bit |
| - turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view |
| - turnip: no 8x msaa on 128bpp formats |
| - turnip: add dirty bit for push constants |
| - turnip: subpass rework |
| - turnip: CmdClearAttachments fixes |
| - turnip: implement subpass input attachments |
| - etnaviv: remove sRGB formats from format table |
| - etnaviv: sRGB render target support |
| - etnaviv: set output mode and saturate bits |
| - etnaviv: update INT_FILTER choice for GLES3 formats |
| - etnaviv: disable integer vertex formats on pre-HALTI2 hardware |
| - etnaviv: remove swizzle from format table |
| - etnaviv: add missing formats |
| - etnaviv: add missing vs_needs_z_div handling to NIR backend |
| - turnip: use single substream cs |
| - turnip: use common blit path for buffer copy |
| - turnip: don't require src image to be set for clear blits |
| - turnip: implement CmdFillBuffer/CmdUpdateBuffer |
| - freedreno/ir3: lower mul_2x32_64 |
| - turnip: fix emit_textures for compute shaders |
| - turnip: remove compute emit_border_color |
| - turnip: fix emit_ibo |
| - turnip: change emit_ibo to be like emit_textures |
| - turnip: remove duplicate A6XX_SP_CS_CONFIG_NIBO |
| - nir: add option to lower half packing opcodes |
| - freedreno/ir3: lower pack/unpack ops |
| - turnip: don't set LRZ enable at end of renderpass |
| - freedreno/ir3: update prefetch input_offset when packing inlocs |
| - turnip: add cache invalidate to fix input attachment cases |
| - turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used |
| - freedreno/ir3: fix vertex shader sysvals with pre_assign_inputs |
| - freedreno/registers: document vertex/instance id offset bits |
| - freedreno/ir3: support load_base_instance |
| - turnip: emit base instance vs driver param |
| - turnip: emit_compute_driver_params fixes |
| - turnip: compute gmem offsets at renderpass creation time |
| - turnip: implement secondary command buffers |
| - nir: fix assign_io_var_locations for vertex inputs |
| - turnip: minor warning fixes |
| - util/format: add missing vulkan formats |
| - turnip: disable B8G8R8 vertex formats |
| - etnaviv: fix incorrectly failing vertex size assert |
| - etnaviv: update headers from rnndb |
| - etnaviv: HALTI2+ instanced draw |
| - etnaviv: implement gl_VertexID/gl_InstanceID |
| - etnaviv: remove unnecessary vertex_elements_state_create error |
| checking |
| - st/mesa: don't lower YUV when driver supports it natively |
| - st/mesa: run st_nir_lower_tex_src_plane for lowered xyuv/ayuv |
| - freedreno/ir3: allow inputs with the same location |
| - turnip: remove tu_sort_variables_by_location |
| - turnip: fix array/matrix varyings |
| - turnip: hook up GetImageDrmFormatModifierPropertiesEXT |
| - turnip: set linear tiling for scanout images |
| - vulkan/wsi: remove unused image_get_modifier |
| - turnip: simplify tu_physical_device_get_format_properties |
| - etnaviv: implement UBOs |
| - turnip: hook up cmdbuffer event set/wait |
| |
| Jordan Justen (7): |
| |
| - iris: Add IRIS_DIRTY_RENDER_BUFFER state flag |
| - iris/gen11+: Move flush for render target change |
| - iris: Allow max dynamic pool size of 2GB for gen12 |
| - intel: Remove unused Tigerlake PCI ID |
| - iris: Fix some indentation in iris_init_render_context |
| - iris: Emit CS Stall before Instruction Cache flush for gen12 WA |
| - anv: Emit CS Stall before Instruction Cache flush for gen12 WA |
| |
| Jose Maria Casanova Crespo (1): |
| |
| - v3d: Fix predication with atomic image operations |
| |
| Juan A. Suarez Romero (3): |
| |
| - nir/lower_double_ops: relax lower mod() |
| - Revert "nir/lower_double_ops: relax lower mod()" |
| - nir/spirv: skip unreachable blocks in Phi second pass |
| |
| Kai Wasserbäch (4): |
| |
| - nir: fix unused variable warning in nir_lower_vars_to_explicit_types |
| - nir: fix unused variable warning in |
| find_and_update_previous_uniform_storage |
| - nir: fix unused function warning in src/compiler/nir/nir.c |
| - intel/gen_decoder: Fix unused-but-set-variable warning |
| |
| Karol Herbst (14): |
| |
| - nv50/ir: fix crash in isUniform for undefined values |
| - nir/validate: validate num_components on registers and intrinsics |
| - nir/serialize: fix vec8 and vec16 |
| - nir/tests: add serializer tests |
| - nir/tests: MSVC build fix |
| - spirv: handle UniformConstant for OpenCL kernels |
| - clover/nir: treat UniformConstant as global memory |
| - clover/nir: set spirv environment to OpenCL |
| - clover/spirv: allow Int64 Atomics for supported devices |
| - nir: handle nir_deref_type_ptr_as_array in |
| rematerialize_deref_in_block |
| - nv50/ir: implement global atomics and handle it for nir |
| - nir/serialize: cast swizzle before shifting |
| - aco: use NIR_MAX_VEC_COMPONENTS instead of 4 |
| - nv50ir/nir: support vec8 and vec16 |
| |
| Kenneth Graunke (57): |
| |
| - iris: Fix "Force Zero RTA Index Enable" setting again |
| - nir: Handle image arrays when setting variable data |
| - Revert "intel/blorp: Fix usage of uninitialized memory in key |
| hashing" |
| - iris: Properly move edgeflag_out from output list to global list |
| - iris: Wrap iris_fix_edge_flags in NIR_PASS |
| - mesa: Handle GL_COLOR_INDEX in \_mesa_format_from_format_and_type(). |
| - iris: Change keybox parenting |
| - iris: Stop mutating the resource in get_rt_read_isl_surf(). |
| - iris: Drop 'old_address' parameter from iris_rebind_buffer |
| - iris: Create an "iris_surface_state" wrapper struct |
| - iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces. |
| - iris: Update SURFACE_STATE addresses when setting sampler views |
| - iris: Disable VF cache partial address workaround on Gen11+ |
| - driconf, glsl: Add a vs_position_always_invariant option |
| - drirc: Set vs_position_always_invariant for Shadow of Mordor on Intel |
| - st/mesa: Add GL_TDFX_texture_compression_FXT1 support |
| - iris: Map FXT1 texture formats |
| - meson: Add a "prefer_iris" build option |
| - main: Change u_mmAllocMem align2 from bytes (old API) to bits (new |
| API) |
| - meson: Include iris in default gallium-drivers for x86/x86_64 |
| - util: Detect use-after-destroy in simple_mtx |
| - intel/genxml: Add a partial TCCNTLREG definition |
| - iris: Enable Gen11 Color/Z write merging optimization |
| - anv: Enable Gen11 Color/Z write merging optimization |
| - intel/decoder: Make get_state_size take a full 64-bit address and a |
| base |
| - iris: Create smaller program keys without legacy features |
| - iris: Default to X-tiling for scanout buffers without modifiers |
| - iris: Alphabetize source files after iris_perf.c was added |
| - drirc: Final Fantasy VIII: Remastered needs |
| allow_higher_compat_version |
| - iris: Make helper functions to turn iris shader keys into brw keys. |
| - iris: Fix shader recompile debug printing |
| - iris: Avoid replacing backing storage for buffers with no contents |
| - intel: Drop Gen11 WaBTPPrefetchDisable workaround |
| - st/nir: Optionally unify inputs_read/outputs_written when linking. |
| - iris: Set nir_shader_compiler_options::unify_interfaces. |
| - st/mesa: Allow ASTC5x5 fallbacks separately from other ASTC LDR |
| formats. |
| - iris: Disable ASTC 5x5 support on Gen9 for now. |
| - iris: Delete remnants of the unimplemented ASTC 5x5 workaround |
| - iris: Allow HiZ for copy_region sources |
| - anv: Only enable EWA LOD algorithm when doing anisotropic filtering. |
| - Revert "nir: assert that nir_lower_tex runs after lowering derefs" |
| - i965: Simplify brw_get_renderer_string() |
| - iris: Simplify iris_get_renderer_string() |
| - intel: Use similar brand strings to the Windows drivers |
| - intel/compiler: Fix illegal mutation in get_nir_image_intrinsic_image |
| - iris: Fix export of fences that have already completed. |
| - st/mesa: Allocate full miplevels if MaxLevel is explicitly set |
| - iris: Drop some workarounds which are no longer necessary |
| - anv: Drop some workarounds that are no longer necessary |
| - intel: Fix aux map alignments on 32-bit builds. |
| - meson: Prefer 'iris' by default over 'i965'. |
| - loader: Check if the kernel driver is i915 before loading iris |
| - iris: Drop 'engine' from iris_batch. |
| - iris: Make iris_emit_default_l3_config pull devinfo from the batch |
| - iris: Support multiple chained batches. |
| - i965: Use brw_batch_references in tex_busy check |
| - loader: Fix leak of kernel driver name |
| |
| Kristian Høgsberg (62): |
| |
| - freedreno/registers: Fix typo |
| - freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST |
| - freedreno/registers: Add comments about primitive counters |
| - freedreno/a6xx: Fix primitive counters again |
| - freedreno/a6xx: Clear sysmem with CP_BLIT |
| - freedreno: Add nogmem debug option to force bypass rendering |
| - freedreno/a6xx: Fix layered texture type enum |
| - freedreno/a6xx: Rename z/s formats
| - freedreno/a6xx: Add register offset for STG/LDG |
| - freedreno/ir3: Emit link map as byte or dwords offsets as needed |
| - freedreno/ir3: Add load and store intrinsics for global io |
| - freedreno: Don't count primitives for patches |
| - freedreno/ir3: Add ir3 intrinsics for tessellation |
| - freedreno/ir3: Use imul24 in offset calculations |
| - freedreno/ir3: Add tessellation field to shader key |
| - freedreno/ir3: Extend geometry lowering pass to handle tessellation |
| - freedreno/ir3: Add new synchronization opcodes |
| - freedreno/ir3: End TES with chsh when using GS |
| - freedreno/ir3: Implement tess coord intrinsic |
| - freedreno/ir3: Implement TCS synchronization intrinsics |
| - freedreno/ir3: Setup inputs and outputs for tessellation stages |
| - freedreno/ir3: Don't assume binning shader is always VS |
| - freedreno/ir3: Pre-color TCS header and primitive ID inputs |
| - freedreno/ir3: Allocate const space for tessellation parameters |
| - freedreno/a6xx: Build the right draw command for tessellation |
| - freedreno/a6xx: Allocate and program tessellation buffer |
| - freedreno/a6xx: Emit constant parameters for tessellation stages |
| - freedreno/a6xx: Program state for tessellation stages |
| - freedreno: Use bypass rendering for tessellation |
| - freedreno/a6xx: Only set emit.hs/ds when we're drawing patches |
| - freedreno/blitter: Save tessellation state |
| - freedreno/a6xx: Only use merged regs and four quads for VS+FS |
| - freedreno/a6xx: Turn on tessellation shaders |
| - freedreno/ir3: Use regid() helper when setting up precolor regs |
| - freedreno/registers: Remove duplicate register definitions |
| - freedreno: New struct packing macros |
| - freedreno/registers: Add 64 bit address registers |
| - freedreno/a6xx: Drop stale include |
| - freedreno/a6xx: Include fd6_pack.h in a few files |
| - freedreno/a6xx: Convert emit_mrt() to OUT_REG() |
| - freedreno/a6xx: Convert emit_zs() to OUT_REG() |
| - freedreno/a6xx: Convert VSC pipe setup to OUT_REG() |
| - freedreno/a6xx: Convert gmem blits to OUT_REG() |
| - freedreno/a6xx: Convert some tile setup to OUT_REG() |
| - freedreno/a6xx: Silence warning for unused perf counters |
| - freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bits |
| - freedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacks |
| - freedreno: Add debug flag for forcing linear layouts |
| - freedreno/a6xx: Program sampler swap based on resource tiling |
| - freedreno/a6xx: Pick blitter swap based on resource tiling |
| - freedreno/a6xx: Add fd_resource_swap() helper |
| - freedreno/a6xx: Use blitter for resolve blits |
| - freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX |
| - freedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macro |
| - freedreno/a6xx: Handle srgb blits on the blitter |
| - freedreno/a6xx: Move handle_rgba_blit() up |
| - freedreno/a6xx: Rewrite compressed blits in a helper function |
| - freedreno/a6xx: Set up multisample sysmem MRTs correctly |
| - st/mesa: Lower vars to ssa and constant prop before |
| gl_nir_lower_buffers |
| - ir3: Set up full/half register conflicts correctly |
| - iris: Advertise PIPE_CAP_NATIVE_FENCE_FD |
| - iris: Print warning and return \*out = NULL when fd to syncobj fails |
| |
| Krzysztof Raszkowski (10): |
| |
| - gallium/swr: Fix GS invocation issues: fixed proper setting of |
| gl_InvocationID; fixed GS vertices output memory overflow. |
| - gallium/swr: Enable some ARB_gpu_shader5 extensions. Enable / add to |
| features.txt: enhanced textureGather, geometry shader instancing, |
| geometry shader multiple streams. |
| - gallium/swr: Fix crash when using GL_TDFX_texture_compression_FXT1 |
| format. |
| - gallivm: add TGSI bit arithmetic opcodes support |
| - gallium/swr: Fix glVertexPointer race condition. |
| - gallium/swr: Disable showing detected arch message. |
| - docs/GL4: update gallium/swr features |
| - gallium/swr: add option for static link |
| - gallium/swr: Fix gcc 4.8.5 compile error |
| - gallium/swr: simplify environment variable expansion code |
| |
| Lasse Lopperi (1): |
| |
| - freedreno/drm: Fix memory leak in softpin implementation |
| |
| Laurent Carlier (1): |
| |
| - egl: avoid local modifications for eglext.h Khronos standard header |
| file |
| |
| Leo Liu (1): |
| |
| - ac: add missing Arcturus to the info of pc lines |
| |
| Lepton Wu (2): |
| |
| - gallium: dri2: Use index as plane number. |
| - android: mesa: Revert "android: mesa: revert "Enable asm |
| unconditionally"" |
| |
| Lionel Landwerlin (60): |
| |
| - intel/dev: set default num_eu_per_subslice on gen12 |
| - intel/perf: add TGL support |
| - intel/perf: fix Android build |
| - mesa: check draw buffer completeness on |
| glClearBufferfi/glClearBufferiv |
| - vulkan: bump headers/registry to 1.1.127 |
| - anv: Properly handle host query reset of performance queries |
| - anv: implement VK_KHR_separate_depth_stencil_layouts |
| - mesa: check framebuffer completeness only after state update |
| - anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit |
| - anv: remove list items on batch fini |
| - anv: detach batch emission allocation from device |
| - anv: expose timeout helpers outside of anv_queue.c |
| - anv: move queue init/finish to anv_queue.c |
| - anv: allow NULL batch parameter to anv_queue_submit_simple_batch |
| - anv: prepare driver to report submission error through queues |
| - anv: refcount semaphores |
| - anv: prepare the driver for delayed submissions |
| - anv/wsi: signal the semaphore in the acquireNextImage |
| - anv: implement VK_KHR_timeline_semaphore |
| - intel/dev: flag the Elkhart Lake platform |
| - intel/perf: add EHL performance query support |
| - intel/perf: fix invalid hw_id in query results |
| - intel/perf: set read buffer len to 0 to identify empty buffer |
| - intel/perf: take into account that reports read can be fairly old |
| - intel/perf: simplify the processing of OA reports |
| - intel/perf: fix improper pointer access |
| - anv: fix missing gen12 handling |
| - anv: fix incorrect VMA alignment for CCS main surfaces |
| - anv: fix fence underlying primitive checks |
| - anv: fix assumptions about temporary fence payload |
| - intel/perf: drop batchbuffer flushing at query begin |
| - i965/iris: perf-queries: don't invalidate/flush 3d pipeline |
| - anv: constify pipeline layout in nir passes |
| - anv: drop unused parameter from apply layout pass |
| - vulkan/wsi: error out when image fence doesn't signal |
| - mesa: avoid triggering assert in implementation |
| - i965/iris/perf: factor out frequency register capture |
| - loader: fix close on uninitialized file descriptor value |
| - anv: don't close invalid syncfd semaphore |
| - anv: fix intel perf queries availability writes |
| - anv: set stencil layout for input attachments |
| - iris: Implement Gen12 workaround for non pipelined state |
| - anv: Implement Gen12 workaround for non pipelined state |
| - anv: only use VkSamplerCreateInfo::compareOp if enabled |
| - anv: fix pipeline switch back for non pipelined states |
| - genxml: add new Gen11+ PIPE_CONTROL field |
| - iris: handle new PIPE_CONTROL field |
| - iris: implement another workaround for non pipelined states |
| - anv: implement another workaround for non pipelined states |
| - intel/perf: expose timestamp begin for mdapi |
| - intel/perf: report query split for mdapi |
| - anv: enable VK_KHR_swapchain_mutable_format |
| - anv: don't report error with other vendor DRM devices |
| - anv: ensure prog params are initialized with 0s |
| - anv/iris: warn gen12 3DSTATE_HS restriction |
| - intel: Implement Gen12 workaround for array textures of size 1 |
| - isl: drop CCS row pitch requirement for linear surfaces |
| - isl: add gen12 comment about CCS for linear tiling |
| - anv: implement gen9 post sync pipe control workaround |
| - anv: set MOCS on push constants |
| |
| Luis Mendes (1): |
| |
| - radv: fix radv secure compile feature breaking compilation on armhf |
| EABI and aarch64 |
| |
| Marco Felsch (1): |
| |
| - etnaviv: Fix assert when trying to accumulate an invalid fd |
| |
| Marek Olšák (245): |
| |
| - glsl: encode/decode types using a union with bitfields for |
| readability |
| - glsl: encode vector_elements and matrix_columns better |
| - glsl: encode explicit_stride for basic types better |
| - glsl: encode array types better |
| - glsl: encode struct/interface types better |
| - st/mesa: call nir_opt_access only once |
| - st/mesa: call nir_lower_flrp only once per shader |
| - compiler: make variable::data::binding unsigned |
| - nir: pack nir_variable::data::stream |
| - nir: pack nir_variable::data::xfb\_\* |
| - radeonsi: use IR SHA1 as the cache key for the in-memory shader cache |
| - radeonsi: don't keep compute shader IR after compilation |
| - radeonsi: keep serialized NIR instead of nir_shader in |
| si_shader_selector |
| - nir: pack the rest of nir_variable::data |
| - nir/serialize: don't expand 16-bit variable state slots to 32 bits |
| - nir/serialize: store 32-bit object IDs instead of 64-bit |
| - nir/serialize: pack nir_variable flags |
| - mesa: expose SPIR-V extensions in the Compatibility profile too |
| - util: add blob_finish_get_buffer |
| - radeonsi/nir: call nir_serialize only once per shader |
| - radeonsi/nir: fix compute shader crash due to nir_binary == NULL |
| - glsl/linker: pass shader_info to analyze_clip_cull_usage directly |
| - compiler: pack shader_info from 160 bytes to 96 bytes |
| - st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for |
| them |
| - st/mesa: rename DEBUG_TGSI -> DEBUG_PRINT_IR |
| - st/mesa: remove \\n being only printed in debug builds after printed |
| TGSI |
| - st/mesa: print TCS/TES/GS/CS TGSI in the right place & keep disk |
| cache enabled |
| - st/mesa: add ST_DEBUG=nir to print NIR shaders |
| - st/mesa: remove unused TGSI-only debug printing functions |
| - gallium/noop: call finalize_nir |
| - radeonsi/nir: remove dead function temps |
| - radeonsi/nir: call nir_lower_flrp only once per shader |
| - radeonsi/nir: don't lower fma, instead, fuse fma |
| - mesa: enable glthread for 7 Days To Die |
| - st/mesa: rename delete_basic_variant -> delete_common_variant |
| - st/mesa: decrease the size of st_fp_variant_key from 48 to 40 bytes |
| - st/mesa: start deduplicating some program code |
| - st/mesa: initialize affected_states and uniform storage earlier in |
| deserialize |
| - st/mesa: consolidate and simplify code flagging |
| program::affected_states |
| - st/mesa: trivially merge st_vertex_program into st_common_program |
| - st/mesa: rename st_common_program to st_program |
| - st/mesa: cleanups after unification of st_vertex/common program |
| - st/mesa: rename occurrences of stcp to stp to correspond to st_program |
| - st/mesa: more cleanups after unification of st_vertex/common_program |
| - st/mesa: subclass st_vertex_program for VP-specific members |
| - st/mesa: call nir_sweep in st_finalize_nir |
| - st/mesa: keep serialized NIR instead of nir_shader in st_program |
| - st/mesa: call nir_serialize only once per shader |
| - nir: move data.image.access to data.access |
| - nir/print: only print image.format for image variables |
| - glsl_to_nir: rename image_access to mem_access |
| - nir: move data.descriptor_set above data.index for better packing |
| - nir: don't use GLenum16 in nir.h |
| - ac: add radeon_info::num_rings and move ring_type to amd_family.h |
| - ac: fill num_rings for remaining IPs |
| - winsys/amdgpu: detect noop dependencies on the same ring correctly |
| - nir: strip as we serialize to remove the nir_shader_clone call |
| - nir/serialize: do ctx = {0} instead of manual initializations |
| - util/blob: add 8-bit and 16-bit reads and writes |
| - nir/serialize: pack instructions better |
| - nir/serialize: pack src better and limit the object count to 1M from |
| 1G |
| - nir/serialize: don't serialize var->data for temporaries |
| - nir/serialize: deduplicate serialized var types by reusing the last |
| unique one |
| - nir/serialize: try to store a diff in var data locations instead of |
| var data |
| - nir/serialize: pack load_const with non-64-bit constants better |
| - nir/serialize: pack 1-component constants into 20 bits if possible |
| - nir/serialize: pack nir_intrinsic_instr::const_index[] better |
| - nir/serialize: try to pack two alu srcs into 1 uint32 |
| - nir/serialize: don't store deref types if not needed |
| - nir/serialize: don't serialize mode for deref non-cast instructions |
| - nir/serialize: try to put deref->var index into the unused bits of |
| the header |
| - nir/serialize: cleanup - fold nir_deref_type_var cases into switches |
| - nir/serialize: try to pack both deref array src into 32 bits |
| - nir/serialize: remove up to 3 consecutive equal ALU instruction |
| headers |
| - nir/serialize: reuse the writemask field for 2 src X swizzles of SSA |
| ALU |
| - nir/serialize: serialize swizzles for vec8 and vec16 |
| - nir/serialize: serialize writemask for vec8 and vec16 |
| - nir/serialize: don't serialize redundant |
| nir_intrinsic_instr::num_components |
| - nir/serialize: use 3 unused bits in intrinsic for |
| packed_const_indices |
| - nir/serialize: support any num_components for remaining instructions |
| - ac: set swizzled bit in cache policy as a hint not to merge |
| loads/stores |
| - radeonsi: initialize the per-context compiler on demand |
| - radeonsi/nir: don't run si_nir_opts again if there is no change |
| - st/mesa: don't serialize all streamout state if there are no SO |
| outputs |
| - st/mesa: don't use redundant stp->state.ir.nir |
| - st/mesa: don't call ProgramStringNotify in glsl_to_nir |
| - st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking |
| for NIR |
| - st/mesa: simplify looping over linked shaders when linking NIR |
| - st/mesa: don't use \*\* in the st_nir_link_shaders signature |
| - st/mesa: add st_variant base class to simplify code for shader |
| variants |
| - ac/nir: don't rely on data.patch for tess factors |
| - radeonsi/nir: implement subgroup system values for SPIR-V |
| - radeonsi: simplify the interface of |
| get_dw_address_from_generic_indices |
| - radeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices |
| - radeonsi/nir: validate is_patch because SPIR-V doesn't set it for |
| tess factors |
| - radeonsi/nir: don't rely on data.patch for tess factors |
| - radeonsi/nir: fix location_frac handling for TCS outputs |
| - radeonsi/nir: support interface output types to fix SPIR-V xfb |
| piglits |
| - radeonsi: enable SPIR-V and GL 4.6 for NIR |
| - util/driconfig: print ATTENTION if MESA_DEBUG=silent is not set |
| - radeonsi/gfx10: simplify some duplicated NGG GS code |
| - radeonsi/gfx10: fix the vertex order for triangle strips emitted by a |
| GS |
| - llvmpipe: implement TEX_LZ and TXF_LZ opcodes |
| - gallivm: implement LOAD with CONSTBUF but don't enable it for |
| llvmpipe |
| - st/mesa: support UBOs for Selection/Feedback/RasterPos |
| - st/mesa: save currently bound vertex samplers and sampler views in |
| st_context |
| - st/mesa: support samplers for Selection/Feedback/RasterPos |
| - st/mesa: support SSBOs for Selection/Feedback/RasterPos |
| - st/mesa: support shader images for Selection/Feedback/RasterPos |
| - st/mesa: use a separate VS variant for the draw module |
| - st/mesa: remove st_vp_variant::num_inputs |
| - st/mesa: remove struct st_vp_variant in favor of st_common_variant |
| - st/mesa: don't generate VS TGSI if NIR is enabled |
| - draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM |
| - st/mesa: release the draw shader properly to fix driver crashes |
| (iris) |
| - st/dri: assume external consumers of back buffers can write to the |
| buffers |
| - radeonsi: enable NIR by default and document GL 4.6 support |
| - radeonsi/gfx10: disable vertex grouping |
| - radeonsi/gfx10: simplify the tess_turns_off_ngg condition |
| - radeonsi: don't rely on CLEAR_STATE to set PA_SC_GENERIC_SCISSOR\_\* |
| - ac: fix ac_get_i1_sgpr_mask for Wave32 |
| - ac: fix the return value in cull_bbox when bbox culling is disabled |
| - radeonsi: deduplicate ES and GS thread enablement code |
| - radeonsi: disallow compute-based culling if polygon mode is enabled |
| - radeonsi: set is_monolithic for VS prologs when the shader is really |
| monolithic |
| - radeonsi: don't wrap the VS prolog in if (ES thread) .. endif |
| - radeonsi/gfx10: don't insert NGG streamout atomics if they are never |
| used |
| - radeonsi: allow generating VS prologs with 0 inputs |
| - radeonsi: fix determining whether the VS prolog is needed |
| - radeonsi: reset more fields in si_llvm_context_set_ir to fix reusing |
| ctx |
| - radeonsi/gfx10: fix ngg_get_ordered_id |
| - amd/addrlib: update to the latest version |
| - ac/surface: fix an assertion failure on gfx9 in CMASK computation |
| - radeonsi/gfx10: don't declare any LDS for NGG if it's not used |
| - radeonsi/gfx10: enable NGG passthrough for eligible shaders |
| - radeonsi/gfx10: improve performance for TES using PrimID but not |
| exporting it |
| - Revert "u_vbuf: Regard non-constant vbufs with non-instance elements |
| as free" |
| - winsys/radeon: initialize pte_fragment_size |
| - radeonsi: preserve the scanout flag for shared resources on gfx9 and |
| gfx10 |
| - radeonsi: ignore PIPE_BIND_SCANOUT for imported textures |
| - radeonsi: remove the "display_dcc_offset == 0" assertion |
| - radeonsi: rename SDMA debug flags |
| - radeonsi: remove broken and unused SI SDMA image copy code |
| - radeonsi: add AMD_DEBUG=nodmaclear for debugging |
| - radeonsi: add AMD_DEBUG=nodmacopyimage for debugging |
| - radeonsi: rename dma_cs -> sdma_cs |
| - radeonsi: move SI and CIK+ SDMA code into 1 common function for |
| cleanups |
| - radeonsi: disable SDMA on gfx8 to fix corruption on RX 580 |
| - radeonsi: remove TGSI |
| - gallium: put u_vbuf_get_caps return values into u_vbuf_caps |
| - gallium/cso_context: move non-vbuf vertex buffer and element code |
| into helpers |
| - gallium: bypass u_vbuf if it's not needed (no fallbacks and no user |
| VBOs) |
| - ac/gpu_info: always use distributed tessellation on gfx10 |
| - radeonsi: fix monolithic pixel shaders with two-sided colors and |
| SampleMaskIn |
| - radeonsi: fix context roll tracking in si_emit_shader_vs |
| - radeonsi: test polygon mode enablement accurately |
| - radeonsi: determine accurately if line stippling is enabled for |
| performance |
| - radeonsi: clean up messy si_emit_rasterizer_prim_state |
| - ac: unify build_sendmsg_gs_alloc_req |
| - ac: unify primitive export code |
| - ac/gpu_info: add pc_lines and use it in radeonsi |
| - ac: add 128-bit bitcount |
| - ac: add ac_build_s_endpgm |
| - radeonsi/gfx9: force the micro tile mode for MSAA resolve correctly |
| on gfx9 |
| - radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size |
| - radeonsi: add si_context::num_vertex_elements |
| - radeonsi: don't allow draw calls with uninitialized VS inputs |
| - radeonsi: simplify si_set_vertex_buffers |
| - ac,radeonsi: increase the maximum number of shader args and return |
| values |
| - radeonsi: put up to 5 VBO descriptors into user SGPRs |
| - radeonsi: don't enable VBOs in user SGPRs if compute-based culling |
| can be used |
| - radeonsi: fix assertion and other failures in |
| si_emit_graphics_shader_pointers |
| - radeonsi: actually enable VBOs in user SGPRs |
| - radeonsi: don't adjust depth and stencil PS output locations |
| - radeonsi: rename DBG_NO_TGSI -> DBG_NO_NIR |
| - radeonsi: remove TGSI from comments |
| - radeonsi: rename si_shader_info -> si_shader_binary_info |
| - radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info |
| - radeonsi: merge si_tessctrl_info into si_shader_info |
| - radeonsi: clean up si_shader_info |
| - radeonsi: rename si_compile_tgsi_main -> si_build_main_function |
| - radeonsi: rename si_shader_create -> si_create_shader_variant for |
| clarity |
| - radeonsi: fold si_create_function into si_llvm_create_func |
| - radeonsi: remove always constant ballot_mask_bits from |
| si_llvm_context_init |
| - radeonsi: move PS LLVM code into si_shader_llvm_ps.c |
| - radeonsi: separate code computing info for small primitive culling |
| - ac/cull: don't read Position.Z if it's not needed for culling |
| - radeonsi: make si_insert_input\_\* functions non-static |
| - radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make |
| space there |
| - radeonsi/gfx10: separate code for getting edgeflags from the |
| gs_invocation_id VGPR |
| - radeonsi/gfx10: separate code for determining the number of vertices |
| for NGG |
| - radeonsi: fix si_build_wrapper_function for compute-based primitive |
| culling |
| - radeonsi: work around an LLVM crash when using |
| llvm.amdgcn.icmp.i64.i1 |
| - radeonsi: move si_insert_input\_\* functions |
| - radeonsi: move tessellation shader code into si_shader_llvm_tess.c |
| - radeonsi: remove llvm_type_is_64bit |
| - radeonsi: move geometry shader code into si_shader_llvm_gs.c |
| - radeonsi: move code for shader resources into |
| si_shader_llvm_resources.c |
| - radeonsi: remove useless #includes |
| - radeonsi: merge si_compile_llvm and si_llvm_compile functions |
| - gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES |
| - st/dri: do FLUSH_VERTICES before calling flush_resource |
| - Revert "radeonsi: unbind image before compute clear" |
| - radeonsi: clean up how internal compute dispatches are handled |
| - radeonsi: don't invoke decompression inside internal launch_grid |
| - radeonsi: fix doubles and int64 |
| - radeonsi: turn an assertion into return in si_nir_store_output_tcs |
| - ac: add prefix bitcount functions |
| - ac: add ac_build_readlane without optimization barrier |
| - radeonsi/gfx10: update comments and remove invalid TODOs |
| - radeonsi/gfx10: correct VS PrimitiveID implementation for NGG |
| - radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of |
| shaders |
| - radeonsi/gfx10: export primitives at the beginning of VS/TES |
| - radeonsi/gfx10: merge main and pos/param export IF blocks into one if |
| possible |
| - radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough |
| - radeonsi/gfx10: move GE_PC_ALLOC setting to shader states |
| - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups |
| - ac: add helper ac_build_triangle_strip_indices_to_triangle |
| - radeonsi/gfx10: rewrite late alloc computation |
| - radeonsi/gfx10: enable GS fast launch for triangles and strips with |
| NGG culling |
| - radeonsi: use ctx->ac. for types and integer constants |
| - radeonsi: move non-LLVM code out of si_shader_llvm.c |
| - radeonsi: move VS shader code into si_shader_llvm_vs.c |
| - radeonsi: move si_shader_llvm_build.c content into si_shader_llvm.c |
| - radeonsi: minor cleanup in si_shader_internal.h |
| - radeonsi: move si_nir_build_llvm into si_shader_llvm.c |
| - radeonsi: fold si_shader_context_set_ir into si_build_main_function |
| - radeonsi: move more LLVM functions into si_shader_llvm.c |
| - radeonsi: make si_compile_llvm return bool |
| - radeonsi: make si_compile_shader return bool |
| - radeonsi: change prototypes of si_is_multi_part_shader & |
| si_is_merged_shader |
| - radeonsi: separate LLVM compilation from non-LLVM code |
| - util/simple_mtx: add a missing include to get ASSERTED |
| - gallium/util: add a cache of live shaders for shader CSO |
| deduplication |
| - radeonsi: use the live shader cache |
| - radeonsi: restructure si_shader_cache_load_shader |
| - radeonsi: print shader cache stats with AMD_DEBUG=cache_stats |
| - radeonsi: expose shader cache stats to the HUD |
| - radeonsi: make screen available to shader part compilation |
| - radeonsi: fix a regression since the addition of si_shader_llvm_vs.c |
| - Revert "winsys/amdgpu: Close KMS handles for other DRM file |
| descriptions" |
| - Revert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible" |
| - radeonsi: don't report that multi-plane formats are supported |
| - radeonsi: fix the DCC MSAA bug workaround |
| - radeonsi: don't wait for shader compilation to finish when destroying |
| a context |
| |
| Marek Vasut (5): |
| |
| - etnaviv: Replace bitwise OR with logical OR |
| - etnaviv: tgsi: Fix gl_FrontFacing support |
| - etnaviv: Report correct number of vertex buffers |
| - etnaviv: Do not filter out PIPE_FORMAT_S8_UINT_Z24_UNORM on |
| pre-HALTI2 |
| - etnaviv: Destroy rsc->pending_ctx set in etna_resource_destroy() |
| |
| Mark Janes (3): |
| |
| - Revert "st/mesa: call nir_serialize only once per shader" |
| - Revert "st/mesa: keep serialized NIR instead of nir_shader in |
| st_program" |
| - iris: separating out common perf code |
| |
| Markus Wick (3): |
| |
| - mapi/glapi: Generate sizeof() helpers instead of fixed sizes. |
| - mesa/glthread: Implement ARB_multi_bind. |
| - drirc: Enable glthread for dolphin/citra/yuzu. |
| |
| Martin Fuzzey (1): |
| |
| - etnaviv: update Android build files |
| |
| Mathias Fröhlich (1): |
| |
| - egl: Implement getImage/putImage on pbuffer swrast. |
| |
| Matt Turner (19): |
| |
| - intel/compiler: Use ARRAY_SIZE() |
| - intel/compiler: Extract GEN\_\* macros into separate file |
| - intel/compiler: Split has_64bit_types into float/int |
| - intel/compiler: Don't disassemble align1 3-src operands on Gen < 10 |
| - intel/compiler: Limit compaction unit tests to specific gens |
| - intel/compiler: Add NF some more places |
| - intel/compiler: Add a INVALID_{,HW_}REG_TYPE macros |
| - intel/compiler: Split hw_type tables |
| - intel/compiler: Handle invalid inputs to brw_reg_type_to_*() |
| - intel/compiler: Handle invalid compacted immediates |
| - intel/compiler: Factor out brw_validate_instruction() |
| - intel/compiler: Validate some instruction word encodings |
| - intel/compiler: Add unit tests for new EU validation checks |
| - intel/compiler: Validate fuzzed instructions |
| - intel/compiler: Test compaction on Gen <= 12 |
| - gitlab-ci: Skip ext_timer_query/time-elapsed |
| - intel/compiler: Move Gen4/5 rounding to visitor |
| - util: Explain BITSET_FOREACH_SET params |
| - util: Remove tmp argument from BITSET_FOREACH_SET macro |
| |
| Mauro Rossi (9): |
| |
| - android: aco: fix Lower to CSSA |
| - android: radeonsi: fix build error due to wrong u_format.csv file |
| path |
| - android: util/format: fix include path list |
| - android: radeonsi: fix build after vl refactoring (v2) |
| - android: nir: add a load/store vectorization pass |
| - android: util: Add a mapping from VkFormat to PIPE_FORMAT. |
| - android: radv: fix vk_format_table.c generated source build |
| - android: radeonsi,ac: fix building error due to ac changes |
| - android: radv: build radv_shader_args.c |
| |
| Michel Dänzer (36): |
| |
| - gitlab-ci: Set arm job CCACHE_DIR properly |
| - gitlab-ci: Use separate arm64 build/test docker images |
| - gitlab-ci: Don't build libdrm for ARM |
| - gitlab-ci: Use ninja -j4 for building dEQP |
| - gitlab-ci: Move artifact preparation to separate script |
| - gitlab-ci: Share dEQP build process between x86 & ARM test image |
| scripts |
| - gitlab-ci: Sort packages in debian-install.sh |
| - gitlab-ci: Run piglit tests with llvmpipe |
| - gitlab-ci: Use separate docker images for x86 build/test jobs |
| - gitlab-ci: Delete install/bin from artifacts as well |
| - gitlab-ci: Document that ci-templates refs must be in sync |
| - gitlab-ci: Use functional container job names |
| - gitlab-ci: Rename container install scripts to match job names |
| (better) |
| - gitlab-ci: Organize images using new REPO_SUFFIX templates feature |
| - gitlab-ci: Directly use host-mapped directory for ccache |
| - gitlab-ci: Stop reporting piglit test results via JUnit |
| - gitlab-ci: Stop storing piglit test results as JUnit |
| - gitlab-ci: Put HTML summary in artifacts for failed piglit jobs |
| - gitlab-ci: Update to current ci-templates master |
| - gitlab-ci: Run piglit glslparser & quick_shader tests separately |
| - glsl/tests: Use splitlines() instead of strip() |
| - gitlab-ci: Use the common run policy for LAVA jobs as well again |
| - gitlab-ci: Overhaul job run policy |
| - gitlab-ci: Don't exclude any piglit quick_shader tests |
| - gitlab-ci: Test against LLVM / clang 9 on x86 |
| - gitlab-ci: Stop using manual jobs for merge requests |
| - gitlab-ci: Set GIT_STRATEGY to none for the dummy job |
| - gitlab-ci: Use single if for manual job rules entry |
| - winsys/amdgpu: Keep a list of amdgpu_screen_winsyses in amdgpu_winsys |
| - winsys/amdgpu: Keep track of retrieved KMS handles using hash tables |
| - winsys/amdgpu: Only re-export KMS handles for different DRM FDs |
| - util: Add os_same_file_description helper |
| - winsys/amdgpu: Re-use amdgpu_screen_winsys when possible |
| - winsys/amdgpu: Close KMS handles for other DRM file descriptions |
| - winsys/amdgpu: Re-use amdgpu_screen_winsys when possible |
| - winsys/amdgpu: Close KMS handles for other DRM file descriptions |
| |
| Michel Zou (3): |
| |
| - Meson: Check for dladdr with MinGW |
| - disk_cache_get_function_timestamp: check for dladdr |
| - Meson: Add llvm>=9 modules |
| |
| Miguel Casas-Sanchez (1): |
| |
| - i965: Ensure that all 2101010 image imports can pass framebuffer |
| completeness. |
| |
| Nanley Chery (3): |
| |
| - gallium/dri2: Fix creation of multi-planar modifier images |
| - gallium: Store the image format in winsys_handle |
| - iris: Fix import of multi-planar surfaces with modifiers |
| |
| Nataraj Deshpande (1): |
| |
| - egl/android: Restrict minimum triple buffering for android |
| color_buffers |
| |
| Nathan Kidd (1): |
| |
| - llvmpipe: Check thread creation errors |
| |
| Neha Bhende (3): |
| |
| - st/mesa: release tgsi tokens for shader states |
| - svga: fix size of format_conversion_table[] |
| - svga: Use pipe_shader_state_from_tgsi to set shader state |
| |
| Neil Armstrong (3): |
| |
| - Add support for T820 CI Jobs |
| - ci: Remove T820 from CI temporarily |
| - gitlab-ci/lava: add pipeline information in the lava job name |
| |
| Neil Roberts (9): |
| |
| - nir/opcodes: Add a helper function to generate the comparison binops |
| - nir/opcodes: Add a helper function to generate reduce opcodes |
| - nir: Add a 16-bit bool type |
| - nir: Add a 8-bit bool type |
| - nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops |
| - freedreno/ir3: Support 16-bit comparison instructions |
| - freedreno/ir3: Add implementation of nir_op_b16csel |
| - freedreno/ir3: Implement f2b16 and i2b16 |
| - freedreno/ir3: Enabling lowering 16-bit flrp |
| |
| Paul Cercueil (5): |
| |
| - kmsro: Extend to include ingenic-drm |
| - u_vbuf: Mark vbufs incompatible if more were requested than HW |
| supports |
| - u_vbuf: Only create driver CSO if no incompatible elements |
| - u_vbuf: Regard non-constant vbufs with non-instance elements as free |
| - u_vbuf: Return true in u_vbuf_get_caps if nb of vbufs is below |
| minimum |
| |
| Paul Gofman (1): |
| |
| - state_tracker: Handle texture view min level in st_generate_mipmap() |
| |
| Paulo Zanoni (2): |
| |
| - intel/compiler: remove the operand restriction for src1 on GLK |
| - intel/compiler: fix nir_op_{i,u}*32 on ICL |
| |
| Peng Huang (1): |
| |
| - radeonsi: make si_fence_server_signal flush pipe without work |
| |
| Philipp Sieweck (1): |
| |
| - svga: check return value of define_query_vgpu{9,10} |
| |
| Pierre Moreau (4): |
| |
| - compiler/spirv: Fix uses of gnu struct = {} extension |
| - include/CL: Update OpenCL headers to latest |
| - clover: Use the dispatch table type from the OpenCL headers |
| - clover/meson: Define OpenCL header macros |
| |
| Pierre-Eric Pelloux-Prayer (54): |
| |
| - radeonsi: tell the shader disk cache what IR is used |
| - mesa: enable msaa in clear_with_quad if needed |
| - mesa: pass vao as a function parameter |
| - mesa: add EXT_dsa glVertexArray\* functions declarations |
| - mesa: rework \_mesa_lookup_vao_err to allow usage from EXT_dsa |
| - mesa: add vao/vbo lookup helper for EXT_dsa |
| - mesa: add EXT_dsa glVertexArray\* functions implementation |
| - mesa: add gl_vertex_array_object parameter to client state helpers |
| - mesa: add EXT_dsa glEnableVertexArrayEXT / glDisableVertexArrayEXT |
| - mesa: add EXT_dsa EnableVertexArrayAttribEXT / |
| DisableVertexArrayAttribEXT |
| - mesa: extract helper function from \_mesa_GetPointerv |
| - mesa: add EXT_dsa glGetVertexArray\* 4 functions |
| - mesa: fix call to \_mesa_lookup_vao_err |
| - radeonsi: fix shader disk cache key |
| - radeonsi: enable mesa_glthread for GfxBench |
| - mesa: update features.txt to reflect EXT_dsa status |
| - mesa: add ARB_framebuffer_no_attachments named functions |
| - mesa: add ARB_vertex_attrib_64bit VertexArrayVertexAttribLOffsetEXT |
| - mesa: add ARB_clear_buffer_object named functions |
| - mesa: add ARB_gpu_shader_fp64 selector-less functions |
| - mesa: add ARB_instanced_arrays EXT_dsa function |
| - mesa: add ARB_texture_buffer_range glTextureBufferRangeEXT function |
| - mesa: implement ARB_texture_storage_multisample + EXT_dsa functions |
| - mesa: extend vertex_array_attrib_format to support EXT_dsa |
| - mesa: add ARB_vertex_attrib_binding glVertexArray\* functions |
| - mesa: add ARB_sparse_buffer NamedBufferPageCommitmentEXT function |
| - mesa: enable EXT_direct_state_access |
| - mesa: fix warning in 32 bits build |
| - radeonsi: implement sdma for GFX9 |
| - radeonsi: display cs blit count for AMD_DEBUG=testdma |
| - radeonsi: use gfx9.surf_offset to compute texture offset |
| - radeonsi: fix multi plane buffers creation |
| - radeonsi: dcc dirty flag |
| - st/mesa: add a notify_before_flush callback param to flush |
| - st/dri: use st->flush callback to flush the backbuffer |
| - radeonsi: disable dcc for 2x MSAA surface and bpe < 4 |
| - gallium: refuse to create buffers larger than UINT32_MAX |
| - radeon/vcn2: enable rate control for hevc encoding |
| - radeonsi: check ctx->sdma_cs before using it |
| - radeonsi: release saved resources in si_retile_dcc |
| - radeonsi: release saved resources in si_compute_expand_fmask |
| - radeonsi: release saved resources in si_compute_clear_render_target |
| - radeonsi: release saved resources in si_compute_copy_image |
| - radeonsi: release saved resources in si_compute_clear_12bytes_buffer |
| - radeonsi: release saved resources in si_compute_do_clear_or_copy |
| - radeonsi: fix fmask expand compute shader |
| - radeonsi: make sure fmask expand is done if needed |
| - radeonsi: unbind image before compute clear |
| - radeonsi: drop the negation from fmask_is_not_identity |
| - util: call bind_sampler_states before setting sampler_views |
| - radeonsi: move AMD_DEBUG tests to AMD_TEST |
| - docs: document AMD_DEBUG variable |
| - radeonsi: stop using the VM_ALWAYS_VALID flag |
| - radeonsi/ngg: add VGT_FLUSH when enabling fast launch |
| |
| Prodea Alexandru-Liviu (2): |
| |
| - Meson: Remove lib prefix from graw and osmesa when building with |
| Mingw. Also remove version suffix from osmesa swrast on Windows. |
| - Appveyor: Quickly fix meson build. As this required use of Python |
| 3.8, mako module also had to be updated. |
| |
| Qiang Yu (3): |
| |
| - lima: sync lima_drm.h with kernel |
| - lima: create heap buffer with new interface if available |
| - lima: add noheap debug option |
| |
| Rafael Antognolli (23): |
| |
| - intel/isl: Add MOCS settings to isl_device. |
| - anv: Use mocs settings from isl_dev. |
| - iris: Use mocs from isl_dev. |
| - intel: Add workaround for stencil state. |
| - intel/genxml: Add 3DSTATE_CONSTANT_ALL packet. |
| - intel/aubinator: Decode 3DSTATE_CONSTANT_ALL. |
| - intel/blorp: Use 3DSTATE_CONSTANT_ALL to setup push constants. |
| - iris: Rework push constants emitting code. |
| - iris: Use 3DSTATE_CONSTANT_ALL when possible. |
| - anv: Move gen8+ push constant packet workaround. |
| - anv: Add get_push_range_address() helper. |
| - anv: Move code for emitting push constants into its own function. |
| - anv: Use 3DSTATE_CONSTANT_ALL when possible. |
| - iris: Add restriction to 3DSTATE_CONSTANT\_ packets. |
| - util/os_socket: Add socket related functions. |
| - vulkan/overlay: Add a control socket. |
| - vulkan/overlay: Add support for a control socket. |
| - vulkan/overlay: Add a command to start capturing data to a file. |
| - vulkan/overlay: Add basic overlay control script. |
| - vulkan/overlay: Update docs. |
| - iris: Implement WA for push constants. |
| - utils/os_socket: Define ssize_t on windows. |
| - intel: Load the driver even if I915_PARAM_REVISION is not found. |
| |
| Rhys Perry (131): |
| |
| - radv: adjust loop unrolling heuristics for int64 |
| - aco: add Instruction::usesModifiers() and add more checks in the |
| optimizer |
| - radv: fix radv_nir_get_max_workgroup_size when nir=NULL |
| - aco: use DPP instead of exec modification when lowering GFX10 |
| shuffles |
| - aco: fix shuffle with uniform operands |
| - nir/divergence: improve DA of shuffle |
| - aco: fix read_invocation with VGPR lane index |
| - aco: don't propagate vgprs into v_readlane/v_writelane |
| - aco: combine read_invocation and shuffle implementations |
| - radv: enable FP16/FP64 denormals earlier and only for LLVM |
| - aco: don't combine literals into v_cndmask_b32/v_subb/v_addc |
| - aco: fix 64-bit fsign with 0 |
| - aco: implement VK_KHR_shader_float_controls |
| - aco: refactor reduction lowering helpers |
| - aco: implement 64-bit integer reductions |
| - radv/aco: enable VK_KHR_shader_subgroup_extended_types |
| - nir: make nir_variable::{num_members,num_state_slots} a uint16_t |
| - nir: add nir_variable::index and nir_index_vars |
| - nir/large_constants: use nir_index_vars and nir_variable::index |
| - docs: update features.txt for RADV |
| - aco: improve waitcnt insertion around loops |
| - aco: fix copy+paste error |
| - aco: fix waitcnts for barriers at block ends |
| - nir: add nir_num_variable_modes and nir_var_mem_push_const |
| - radv: set alignment for load_ssbo/store_ssbo in meta shaders |
| - nir: add a load/store vectorization pass |
| - nir: add load/store vectorizer tests |
| - aco: enable load/store vectorizer |
| - aco: allow constant offsets for global/scratch instructions on GFX10 |
| - aco: set dlc/glc correctly for image loads |
| - aco: propagate p_wqm on an image_sample's coordinate p_create_vector |
| - aco: fix i2i64 |
| - aco: fix incorrect cast in parse_wait_instr() |
| - aco: add v_nop in between exec write and VMEM/DS/FLAT |
| - aco: improve WAR hazard workaround with >64bit stores |
| - aco: fix GFX10 opcodes for some global/flat atomics |
| - aco: fix assembly of FLAT/GLOBAL atomics |
| - aco: fix SADDR with FLAT on GFX10 |
| - aco: don't enable store_global for helper invocations |
| - aco: improve FLAT/GLOBAL scheduling |
| - aco: implement global atomics |
| - ac/llvm: fix pointer type for global atomics |
| - ac/llvm: improve sync scope for global atomics |
| - radv: set writes_memory for global memory stores/atomics |
| - aco: validate the CFG |
| - aco: handle loop exit and IF merge phis with break/discard |
| - aco: fix block_kind_discard s_andn2 definition to exec |
| - nir/lower_io_to_vector: don't create arrays when not needed |
| - nir/load_store_vectorize: fix combining stores with aliasing loads |
| between |
| - aco/wave32: fix comparison optimizations |
| - aco: improve jump threading with wave32 |
| - aco: fix vgpr alloc granule with wave32 |
| - aco: limit register usage for large work groups |
| - aco: set vm for pos0 exports on GFX10 |
| - aco: fix imageSize()/textureSize() with large buffers on GFX8 |
| - aco: fix uninitialized data in the binary |
| - aco: handle VOP3 modifiers when combining a constant comparison's NaN |
| test |
| - aco: handle omod successors with the constant in the first operand |
| - aco: check usesModifiers() when identifying a neg/abs |
| - aco: better handle neg/abs of sgprs |
| - aco: set exec_potentially_empty for demotes |
| - aco: don't DCE atomics with return values |
| - aco: disable add combining for ds_swizzle_b32 |
| - aco: check if multiplication/clamp is live when applying output |
| modifier |
| - nir/divergence: handle load_primitive_id in GS |
| - nir/lower_gs_intrinsics: add option for per-stream counts |
| - aco: update IR validator |
| - aco: apply literals to split mads |
| - aco: combine two sgprs into a VALU if they're the same |
| - aco: improve can_use_VOP3() |
| - aco: rewrite literal combining |
| - aco: rewrite apply_sgprs() |
| - aco: add check_vop3_operands() |
| - aco: be more careful with literals in combine_salu_{n2,lshl_add} |
| - aco: follow through temporary when merging tests into constant |
| comparisons |
| - aco: allow applying two sgprs to an instruction |
| - aco: allow an extra SGPR with multiple uses to be applied to VOP3 |
| - aco: take advantage of GFX10's constant bus limit and VOP3 literals |
| - aco: improve creation of v_madmk_f32/v_madak_f32 |
| - aco: fix clamp optimization |
| - aco: improve clamp optimization |
| - aco: add min(-max(), ) and max(-min(), ) optimization |
| - aco: don't move literal to reg when making an instruction VOP3 on |
| GFX10 |
| - aco: allow input modifiers on v_cndmask_b32 |
| - aco: replace extract_vector with copies |
| - aco: improve readfirstlane after uniform LDS loads |
| - aco: add integer min/max to can_swap_operands |
| - nir/sink,nir/move: move/sink load_per_vertex_input |
| - nir/sink,nir/move: move/sink nir_op_mov |
| - nir/algebraic: a & ~(a >> 31) -> imax(a, 0) |
| - aco: fix stack buffer overflow in apply_sgprs() |
| - aco: fix fall-through test in try_remove_simple_block() with |
| back-edges |
| - aco: fix operand kill flags when a temporary is used more than once |
| - aco: fix off-by-one error when initializing sgpr_live_in |
| - radv: move gs copy shader creation before other variants |
| - aco: improve support for s_sendmsg |
| - radv/aco,aco: implement GS on GFX9+ |
| - aco: implement GS on GFX7-8 |
| - radv/aco: allow ACO for GS |
| - aco: explicitly mark end blocks for exports |
| - aco: remove needs_instance_id |
| - aco: implement GS copy shaders |
| - radv/aco: use ACO for GS copy shaders |
| - aco: use nir_move_copies |
| - aco: fix WaR check for >64-bit FLAT/GLOBAL instructions |
| - aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb |
| - aco: always add sgprs to sgpr_ids when choosing literals |
| - aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc |
| - amd/common,radv: move vertex_format_table to ac_shader_util.{h,c} |
| - aco: rework vertex fetching a bit |
| - aco: skip unused channels at the start when fetching vertices |
| - aco: handle unaligned vertex fetch on GFX10 |
| - aco: value-number MUBUF instructions |
| - aco: use MUBUF in some situations instead of splitting vertex fetches |
| - aco: fix rebase error from GS copy shader support |
| - aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in |
| WQM |
| - aco: run p_wqm instructions in WQM |
| - nir/algebraic: add patterns for a >> #b << #b |
| - nir/algebraic: add some half packing optimizations |
| - aco: fix target calculation when vgpr spilling introduces sgpr |
| spilling |
| - aco: don't consider loop header blocks branch blocks in |
| add_coupling_code |
| - aco: don't update demand in add_coupling_code() for loop headers |
| - aco: only create parallelcopy to restore exec at loop exit if needed |
| - aco: don't always add logical edges from continue_break blocks to |
| headers |
| - aco: error when block has no logical preds but VGPRs are live at the |
| start |
| - aco: set exec_potentially_empty after continues/breaks in nested IFs |
| - aco: improve assertion at the end of spiller |
| - aco: fill reg_demand with sensible information in add_coupling_code() |
| - aco: parallelcopy exec mask before s_wqm |
| - aco: fix exec mask consistency issues |
| - aco: fix gfx10_wave64_bpermute |
| |
| Ricardo Garcia (1): |
| |
| - anv: Unify GetDeviceQueue and GetDeviceQueue2 |
| |
| Rob Clark (89): |
| |
| - freedreno/ir3: split pre-coloring to its own function |
| - freedreno/ir3: use SSA flag on dest register too |
| - freedreno/ir3: ir3_print tweaks |
| - freedreno/ir3/ra: move regs_count==0 check |
| - freedreno/ir3/ra: remove ir print after livein/out |
| - freedreno/ir3: remove obsolete comment |
| - freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION |
| - freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION |
| - freedreno/ir3: sync disasm changes from envytools |
| - freedreno/ir3: also track # of nops for shader-db |
| - freedreno: fix eglDupNativeFenceFD error |
| - freedreno/ir3: fix valgrind complaint with STLW |
| - freedreno/ir3: remove half-precision output |
| - freedreno/ir3: rename fanin/fanout to collect/split |
| - freedreno/ir3: remove impossible condition |
| - freedreno/ir3: add input/output iterators |
| - freedreno/ir3: show input/output wrmask's in disasm |
| - freedreno/ir3: helper to print ir if debug enabled |
| - freedreno/ir3: remove first-vertex sysval |
| - freedreno/ir3: simplify creating sysval inputs |
| - freedreno/ir3: re-work shader inputs/outputs |
| - freedreno/ir3: only tex instructions have wrmask |
| - freedreno/ir3: fix gpu hang with pre-fs-tex-fetch |
| - freedreno/ir3: legalize cleanups |
| - freedreno/ir3: remove unused parameter |
| - freedreno/perfcntrs: small cleanup |
| - freedreno/perfcntrs: remove gallium dependencies |
| - freedreno/perfcntrs: move to shared location |
| - freedreno/perfcntrs: add accessor to get per-gen tables |
| - freedreno/perfctrs/a2xx: move CP to be first group |
| - freedreno/perfcntrs/a6xx: remove RBBM counters |
| - freedreno/perfcntrs: add fdperf |
| - freedreno/perfctrs/fdperf: periodically restore counters |
| - gitlab-ci: update deqp build so we can generate xml |
| - gitlab-ci/deqp: preserve full list of unexpected results |
| - gitlab-ci/deqp: preserve caselists for blocks with fails |
| - gitlab-ci/deqp: detect and report flakes |
| - gitlab-ci: bump arm test container |
| - gitlab-ci/deqp: generate xml results for fails/flakes |
| - gitlab-ci/deqp: generate junit results |
| - gitlab-ci/freedreno/a6xx: remove most of the flakes |
| - freedreno: use rsc->slice accessor everywhere |
| - freedreno: switch to layout helper |
| - gitlab-ci: disable junit results for deqp |
| - freedreno/ir3: remove store_output lowered to store_shared_ir3 |
| - freedreno/ir3: fix neverball assert in case of unused VS inputs |
| - nir/lower_clip: Fix incorrect driver loc for clipdist outputs |
| - freedreno/fdperf: use drmOpen() |
| - freedreno/a6xx: disable LRZ when blending |
| - freedreno/a5xx+a6xx: split LRZ layout to per-gen |
| - freedreno/a6xx: fix LRZ layout |
| - freedreno/a6xx: fix LRZ logic |
| - freedreno/a6xx: enable LRZ by default |
| - spirv: add OpLifetime\* |
| - freedreno/ir3: add last-baryf shaderdb stat |
| - freedreno/ir3: add scheduler traces |
| - freedreno/ir3: add iterator macros |
| - freedreno/a6xx: fix OUT_REG() vs growable cmdstream |
| - nir+vtn: vec8+vec16 support |
| - freedreno/ir3: fix flat shading again |
| - nir: assert that nir_lower_tex runs after lowering derefs |
| - mesa/st: lower samplers before nir_lower_tex |
| - freedreno/ir3: rename instructions |
| - gitlab-ci: fix missing caselist.css/xsl |
| - freedreno/a6xx: limit scratch/debug markers to debug builds |
| - freedreno/a6xx: cleanup rasterizer state |
| - freedreno/a6xx: separate rast stateobj for prim restart |
| - freedreno/a6xx: drop a few more per-draw registers |
| - freedreno/a6xx: move dynamic program state to streaming stateobj |
| - freedreno/a6xx: add PROG_FB_RAST stateobj |
| - freedreno/drm: fix invalid-cmdstream-size with older kernels |
| - freedreno: use PIPE_CAP_RGB_OVERRIDE_DST_ALPHA_BLEND |
| - mesa/st: random whitespace cleanup |
| - freedreno/a6xx: remove special handling based on MRT format |
| - freedreno/a6xx: convert blend state to stateobj |
| - freedreno: extract vsc pipe bo from GMEM state |
| - freedreno: consolidate GMEM state |
| - freedreno: constify fd_tile |
| - freedreno: constify fd_vsc_pipe |
| - freedreno/a6xx: constify gmem state |
| - freedreno/a5xx: constify gmem state |
| - freedreno/a4xx: constify gmem state |
| - freedreno/a3xx: constify gmem state |
| - freedreno/a2xx: constify gmem state |
| - freedreno: get GMEM state from batch |
| - freedreno: add gmem state cache |
| - freedreno: add gmem_lock |
| - freedreno: remove flush-queue |
| - freedreno: allow ctx->batch to be NULL |
| |
| Robert Foss (5): |
| |
| - nir: Build nir_lower_point_size.c in libmesa_nir |
| - android: Add panfrost support to build scripts |
| - android: Fix u_format_table.c being generated twice |
| - panfrost: Prefix schedule_program to prevent collision |
| - android: Fix whitespace issue |
| |
| Rohan Garg (1): |
| |
| - gitlab-ci: Use lavacli from packages |
| |
| Roland Scheidegger (3): |
| |
| - gallium/scons: fix graw_gdi build |
| - util/atomic: Fix p_atomic_add for unlocked and msvc paths |
| - winsys/svga: use new ioctl for logging |
| |
| Roman Stratiienko (2): |
| |
| - Android: Fix build issue without LLVM |
| - panfrost: Fix Android build |
| |
| Ross Zwisler (1): |
| |
| - intel: limit shader geometry on BDW GT1 |
| |
| Sagar Ghuge (1): |
| |
| - intel/compiler: Clear accumulator register before EOT |
| |
| Samuel Iglesias Gonsálvez (1): |
| |
| - main: fix coverity error in \_mesa_program_resource_find_name() |
| |
| Samuel Pitoiset (202): |
| |
| - radv: declare NGG scratch for VS or TES and only on GFX10 |
| - radv: fix compute pipeline keys when optimizations are disabled |
| - docs: document all RADV environment variables |
| - radv: add a note about perftest/debug options |
| - radv: fix 32-bit compiler warnings |
| - nir: fix packing of nir_variable |
| - radv/gfx10: enable wave32 for compute based on shader's wavesize |
| - radv: hardcode the number of waves for the GFX6 LS-HS bug |
| - radv: determine shaders wavesize at pipeline level |
| - radv: rely on shader's wavesize when computing NGG info |
| - radv: implement VK_EXT_subgroup_size_control |
| - radv/gfx10: fix primitive indices orientation for NGG GS |
| - ac: handle pointer types to LDS in ac_get_elem_bits() |
| - gitlab-ci: build a specific libdrm version for ARM64 |
| - gitlab-ci: build RADV on ARM64 |
| - ac: fix build with recent LLVM |
| - radv: remove useless RADV_DEBUG=unsafemath debug option |
| - radv: make sure to not clear the ds attachment after resolves |
| - ac: add radeon_info::has_l2_uncached |
| - radv: implement VK_AMD_device_coherent_memory |
| - spirv: fix lowering of OpGroupNonUniformAllEqual |
| - ac: remove useless cast in ac_build_set_inactive() |
| - ac: add 8-bit and 16-bit supports to ac_build_shuffle() |
| - ac: add 8-bit and 16-bit supports to ac_build_readlane() |
| - ac: add 8-bit and 16-bit supports to ac_build_set_inactive() |
| - ac: add 8-bit and 16-bit supports to ac_build_dpp() |
| - ac: add 8-bit and 16-bit supports to ac_build_swizzle() |
| - ac: add 8-bit and 16-bit supports to get_reduction_identity() |
| - ac: add 8-bit and 16-bit supports to ac_build_wwm() |
| - ac: add 8-bit and 16-bit supports to ac_build_optimization_barrier() |
| - ac: add 16-bit float support to ac_build_alu_op() |
| - radv: advertise VK_KHR_shader_subgroup_extended_types on GFX8-GFX9 |
| - radv: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7 |
| - docs: add missing new features for RADV |
| - pipe-loader: check that the pointer to driconf_xml isn't NULL |
| - gitlab-ci: move building piglit into a separate script |
| - gitlab-ci: fix ldd check for Vulkan drivers |
| - gitlab-ci: add a job that only builds things needed for testing |
| - gitlab-ci: do not build with debugoptimized for meson-main |
| - gitlab-ci: build swr in meson-main |
| - gitlab-ci: build GLVND in meson-clang |
| - gitlab-ci: remove now useless meson-swr-glvnd build job |
| - gitlab-ci: reduce the number of scons builds |
| - radv: disable subgroup shuffle operations on GFX10 |
| - ac/llvm: fix the local invocation index for wave32 |
| - meson: only build imgui when needed |
| - radv: set the image view aspect mask during subpass transitions |
| - radv: set the image view aspect mask before resolves |
| - radv: rework creation of decompress/resummarize meta pipelines |
| - radv: create decompress pipelines for separate depth/stencil layouts |
| - radv: select the depth decompress path based on the aspect mask |
| - ac/llvm: fix warning in ac_build_canonicalize() |
| - radv: fix reporting subgroup size with |
| VK_KHR_pipeline_executable_properties |
| - radv: fix enabling sample shading with SampleID/SamplePosition |
| - radv/gfx10: fix implementation of exclusive scans |
| - ac: add 8-bit and 16-bit supports to ac_build_permlane16() |
| - radv: enable VK_KHR_shader_subgroup_extended_types on GFX10 |
| - ac/llvm: convert src operands to pointers if necessary |
| - radv: add more constants to avoid using magic numbers |
| - radv,ac/nir: lower deref operations for shared memory |
| - aco: drop useless lowering of deref operations for shared memory |
| - ac/llvm: fix atomic var operations if source isn't a deref |
| - radv: remove dead shader input/output variables |
| - radv: simplify a check in radv_fixup_vertex_input_fetches() |
| - radv/gfx10: fix the vertex order for triangle strips emitted by a GS |
| - gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh |
| - gitlab-ci: add a gl suffix to the x86 test image and all test jobs |
| - gitlab-ci: add a new job that builds a base test image for VK |
| - gitlab-ci: build cts_runner in the x86 test image for VK |
| - gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK |
| - gitlab-ci: add a new base test job for VK |
| - gitlab-ci: allow to run dEQP Vulkan with DEQP_VER |
| - gitlab-ci: configure the Vulkan ICD export with VK_DRIVER |
| - gitlab-ci: build RADV in meson-testing |
| - gitlab-ci: add a job that runs Vulkan CTS with RADV conditionally |
| - radv: do not use VK_TRUE/VK_FALSE |
| - radv: move emission of two PA_SC\_\* registers to the pipeline CS |
| - radv: fix possibly wrong PA_SC_AA_CONFIG value for conservative rast |
| - radv: synchronize after performing separate depth/stencil fast |
| clears |
| - radv: do not init HTILE as compressed state when dst layout allows it |
| - radv: initialize HTILE for separate depth/stencil aspects |
| - radv: implement VK_KHR_separate_depth_stencil_layouts |
| - gitlab-ci: set RADV_DEBUG=checkir for RADV test jobs |
| - ac/nir: fix out-of-bound access when loading constants from global |
| - radv: enable SpvCapabilityImageMSArray |
| - radv: handle unaligned vertex fetches on GFX6/GFX10 |
| - radv/gfx10: fix ngg_get_ordered_id |
| - radv/gfx10: fix the out-of-bounds check for vertex descriptors |
| - ac: declare an enum for the OOB select field on GFX10 |
| - radv: init a default multisample state for the resolve FS path |
| - radv: ignore pMultisampleState if rasterization is disabled |
| - radv: ignore pTessellationState if the pipeline doesn't use tess |
| - radv: ignore pDepthStencilState if rasterization is disabled |
| - radv: tidy up radv_pipeline_init_blend_state() |
| - radv: ignore pColorBlendState if rasterization is disabled |
| - radv: rely on pipeline layout when creating push descriptors with |
| template |
| - radv: return the correct pitch for linear mipmaps on GFX10 |
| - radv: record number of color/depth samples for each subpass |
| - radv: implement VK_AMD_mixed_attachment_samples |
| - ac/surface: use uint16_t for mipmap level pitches |
| - radv: do not fill keys from fragment shader twice |
| - spirv: add SpvCapabilityImageReadWriteLodAMD |
| - spirv,nir: add new lod parameter to image_{load,store} intrinsics |
| - amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod |
| - aco: handle nir_intrinsic_image_deref_{load,store} with lod |
| - radv: advertise VK_AMD_shader_image_load_store_lod |
| - radv/gfx10: disable vertex grouping |
| - radv/gfx10: determine if a pipeline is eligible for NGG passthrough |
| - radv/gfx10: do not declare LDS for NGG if useless |
| - radv/gfx10: add support for NGG passthrough mode |
| - radv/gfx10: improve performance for TES using PrimID but not |
| exporting it |
| - radv: only use VkSamplerCreateInfo::compareOp if enabled |
| - radv/gfx10: enable all CUs if NGG is never used |
| - radv/gfx10: simplify some duplicated NGG GS code |
| - vulkan/overlay: Fix for Vulkan 1.2 |
| - radv: update VK_EXT_descriptor_indexing for Vulkan 1.2 |
| - radv: update VK_EXT_host_query_reset for Vulkan 1.2 |
| - radv: update VK_EXT_sampler_filter_minmax for Vulkan 1.2 |
| - radv: update VK_EXT_scalar_block_layout for Vulkan 1.2 |
| - radv: update VK_KHR_8bit_storage for Vulkan 1.2 |
| - radv: update VK_KHR_buffer_device_address for Vulkan 1.2 |
| - radv: update VK_KHR_create_renderpass2 for Vulkan 1.2 |
| - radv: update VK_KHR_depth_stencil_resolve for Vulkan 1.2 |
| - radv: update VK_KHR_draw_indirect_count for Vulkan 1.2 |
| - radv: update VK_KHR_driver_properties for Vulkan 1.2 |
| - radv: update VK_KHR_image_format_list for Vulkan 1.2 |
| - radv: update VK_KHR_imageless_framebuffer for Vulkan 1.2 |
| - radv: update VK_KHR_shader_atomic_int64 for Vulkan 1.2 |
| - radv: update VK_KHR_shader_float16_int8 for Vulkan 1.2 |
| - radv: update VK_KHR_shader_float_controls for Vulkan 1.2 |
| - radv: update VK_KHR_shader_subgroup_extended_types for Vulkan 1.2 |
| - radv: update VK_KHR_uniform_buffer_standard_layout for Vulkan 1.2 |
| - radv: update VK_KHR_timeline_semaphore for Vulkan 1.2 |
| - radv: implement Vulkan 1.1 features and properties |
| - radv: implement Vulkan 1.2 features and properties |
| - radv: enable Vulkan 1.2 |
| - aco: fix emitting SMEM instructions with no operands on GFX6-GFX7 |
| - aco: do not select 96-bit/128-bit variants for ds_read/ds_write on |
| GFX6 |
| - aco: do not combine additions of DS instructions on GFX6 |
| - aco: implement stream output with vec3 on GFX6 |
| - aco: fix emitting slc for MUBUF instructions on GFX6-GFX7 |
| - aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the |
| system |
| - aco: fix constant folding of SMRD instructions on GFX6 |
| - aco: do not use the vec3 variant for stores on GFX6 |
| - aco: do not use the vec3 variant for loads on GFX6 |
| - aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7 |
| - aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6 |
| - radv: fix double free corruption in radv_alloc_memory() |
| - radv: add explicit external subpass dependencies to meta operations |
| - radv: handle missing implicit subpass dependencies |
| - spirv: add SpvCapabilityFragmentMaskAMD |
| - nir: add two new texture ops for multisample fragment color/mask |
| fetches |
| - spirv: add support for SpvOpFragment{Mask}FetchAMD operations |
| - nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch |
| - ac/nir: add support for nir_texop_fragment_{mask}_fetch |
| - aco: add support for nir_texop_fragment_{mask}_fetch |
| - radv: advertise VK_AMD_shader_fragment_mask |
| - aco: fix printing assembly with CLRXdisasm on GFX6 |
| - aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample |
| - aco: implement nir_intrinsic_store_global on GFX6 |
| - aco: implement nir_intrinsic_load_global on GFX6 |
| - aco: implement nir_intrinsic_global_atomic\_\* on GFX6 |
| - aco: implement 64-bit nir_op_ftrunc on GFX6 |
| - aco: implement 64-bit nir_op_fceil on GFX6 |
| - aco: implement 64-bit nir_op_fround_even on GFX6 |
| - aco: implement 64-bit nir_op_ffloor on GFX6 |
| - aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6 |
| - ac/llvm: fix missing casts in ac_build_readlane() |
| - aco: combine MRTZ (depth, stencil, sample mask) exports |
| - aco: fix a hardware bug for MRTZ exports on GFX6 |
| - aco: fix a hazard with v_interp\_\* and v_{read,readfirst}lane\_\* on |
| GFX6 |
| - aco: copy the literal offset of SMEM instructions to a temporary |
| - radv: enable ACO support for GFX6 |
| - radv: print NIR shaders after lowering FS inputs/outputs |
| - radv: do not allow sparse resources with multi-planar formats |
| - radv: enable VK_AMD_shader_fragment_mask on GFX6-GFX7 |
| - compiler: add a new explicit interpolation mode |
| - spirv: add support for SpvDecorationExplicitInterpAMD |
| - compiler: add PERSP to the existing barycentric system values |
| - compiler: add new SYSTEM_VALUE_BARYCENTRIC\_\* |
| - spirv: add support for SpvBuiltInBaryCoord\* |
| - nir: add nir_intrinsic_load_barycentric_model |
| - nir: lower SYSTEM_VALUE_BARYCENTRIC\_\* to nir_load_barycentric() |
| - nir: add nir_intrinsic_interp_deref_at_vertex |
| - nir: lower interp_deref_at_vertex to load_input_vertex |
| - spirv: implement SPV_AMD_shader_explicit_vertex_parameter |
| - ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter |
| - aco: implement VK_AMD_shader_explicit_vertex_parameter |
| - radv: gather which input PS variables use an explicit interpolation |
| mode |
| - radv: implement VK_AMD_shader_explicit_vertex_parameter |
| - radv: bump conformance version to 1.2.0.0 |
| - radv: remove the non conformant VK implementation warning on GFX10 |
| - aco: fix VS input loads with MUBUF on GFX6 |
| - radv/gfx10: add a separate flag for creating a GDS OA buffer |
| - radv/gfx10: implement NGG GS queries |
| - radv/gfx10: re-enable NGG GS |
| - radv: refactor physical device properties |
| - aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6 |
| - aco: do not use ds_{read,write}2 on GFX6 |
| - aco: fix waiting for scalar stores before "writing back" data on |
| GFX8-GFX9 |
| - aco: fix creating v_madak if v_mad_f32 has two sgpr literals |
| - nir: do not use De Morgan's Law rules for flt and fge |
| |
| Samuel Thibault (3): |
| |
| - loader: #define PATH_MAX when undefined (eg. Hurd) |
| - util: Do not fail to build on unknown pthread_setname_np |
| - meson: Do not require libdrm for DRI2 on hurd |
| |
| Satyajit Sahu (1): |
| |
| - radeon/vcn: Handle crop parameters for encoder |
| |
| Sonny Jiang (1): |
| |
| - radeonsi: use compute shader for clear 12-byte buffer |
| |
| Stephan Gerhold (1): |
| |
| - kmsro: Add "mcde" entry point |
| |
| Tapani Pälli (33): |
| |
| - nir: fix couple of compile warnings |
| - util/android: fix android build errors |
| - Revert "egl: implement new functions from |
| EGL_EXT_image_flush_external" |
| - Revert "egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT" |
| - Revert "st/dri: add support for EGL_EXT_image_flush_external" |
| - Revert "st/dri: assume external consumers of back buffers can write |
| to the buffers" |
| - Revert "dri_interface: add interface for |
| EGL_EXT_image_flush_external" |
| - mesa: allow bit queries for EXT_disjoint_timer_query |
| - Revert "mesa: allow bit queries for EXT_disjoint_timer_query" |
| - mesa: allow bit queries for EXT_disjoint_timer_query |
| - gitlab-ci: update Piglit commit, update skips |
| - mapi: add GetInteger64vEXT with EXT_disjoint_timer_query |
| - glsl: handle max uniform limits with lower_const_arrays_to_uniforms |
| - gitlab-ci: bump piglit checkout commit |
| - glsl: additional interface redeclaration check for SSO programs |
| - intel/compiler: add newline to limit_dispatch_width message |
| - intel/compiler: force simd8 when dual src blending on gen8 |
| - dri: add \__DRI_IMAGE_FORMAT_SXRGB8 |
| - i965: expose MESA_FORMAT_B8G8R8X8_SRGB visual |
| - mesa/st/i965: add a ProgramResourceHash for quicker resource lookup |
| - mesa: create program resource hash in a single place |
| - iris: set depth stall enabled when depth flush enabled on gen12 |
| - anv: set depth stall enabled when depth flush enabled on gen12 |
| - isl/gen12: add reminder comment about missing WA with 3D surfaces |
| - anv: fix assert in GetImageDrmFormatModifierPropertiesEXT |
| - anv: add assert for isl_mod_info in choose_isl_tiling_flags |
| - anv: initialize clear_color_is_zero_one |
| - egl/android: fix buffer_count for applications setting max count |
| - anv/android: setup gralloc1 usage from gralloc0 usage manually |
| - anv/android: make format_supported_with_usage static |
| - intel/vec4: fix valgrind errors with vf_values array |
| - glsl: fix a memory leak with resource_set |
| - iris: fix aux buf map failure in 32bits app on Android |
| |
| Thomas Hellstrom (4): |
| |
| - winsys/svga: Enable transhuge pages for buffer objects |
| - svga: Avoid discard DMA uploads |
| - gallium/util: Increase the debug_flush map depth |
| - svga: Fix banded DMA upload |
| |
| Thong Thai (8): |
| |
| - st/va: Convert interlaced NV12 to progressive |
| - util/format: Add the P010 format used for 10-bit videos |
| - gallium: Add PIPE_FORMAT_P010 support |
| - st/va: Add support for P010, used for 10-bit videos |
| - radeon: Use P010 for decoding of 10-bit videos |
| - r600: Remove HEVC related code since HEVC is not supported |
| - mesa: Prevent \_MaxLevel from being less than zero |
| - Revert "st/va: Convert interlaced NV12 to progressive" |
| |
| Timothy Arceri (66): |
| |
| - glsl: just use NIR to lower outputs when driver can't read outputs |
| - glsl: disable lower_fragdata_array() for NIR drivers |
| - mesa: add ARB_shading_language_include stubs |
| - glsl: add infrastructure for ARB_shading_language_include |
| - mesa: add ARB_shading_language_include infrastructure to |
| gl_shared_state |
| - mesa: add helper to validate tokenise shader include path |
| - mesa: add \_mesa_lookup_shader_include() helper |
| - mesa: add copy_string() helper |
| - mesa: add glNamedStringARB() support |
| - mesa: implement glGetNamedStringARB() |
| - mesa: make error checking optional in \_mesa_lookup_shader_include() |
| - mesa: implement glIsNamedStringARB() |
| - mesa: implement glGetNamedStringivARB() |
| - mesa: split \_mesa_lookup_shader_include() in two |
| - mesa: implement glDeleteNamedStringARB() |
| - glsl: add ARB_shading_language_include support to #line |
| - glsl: pass gl_context to glcpp_parser_create() |
| - glsl: add preprocessor #include support |
| - glsl: error if #include used while extension is disabled |
| - glsl: add can_skip_compile() helper |
| - glsl: delay compilation skip if shader contains an include |
| - mesa: add support cursor support for relative path shader includes |
| - mesa: add shader include lookup support for relative paths |
| - mesa: implement glCompileShaderIncludeARB() |
| - mesa: enable ARB_shading_language_include |
| - gitlab-ci: bump piglit checkout commit |
| - gitlab-ci: update for arb_shading_language_include |
| - compiler: move build definition of pp_standalone_scaffolding.c |
| - radv: add some infrastructure for fresh forks for each secure compile |
| - radv: add a secure_compile_open_fifo_fds() helper |
| - radv: create a fresh fork for each pipeline compile |
| - docs: update source code repository documentation |
| - glsl: move calculate_array_size_and_stride() to link_uniforms.cpp |
| - glsl: don't set uniform block as used when its not |
| - glsl: make use of active_shader_mask when building resource list |
| - glsl/nir: iterate the system values list when adding varyings |
| - docs: remove mailing list as way of submitting patches |
| - glsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir() |
| - glsl: copy the how_declared field when converting to nir |
| - nir: add some fields to nir_variable_data |
| - glsl: copy the new data fields when converting to nir |
| - glsl: add support for named varyings in |
| nir_build_program_resource_list() |
| - glsl: add subroutine support to nir_build_program_resource_list() |
| - st/glsl_to_nir: call gl_nir_lower_buffers() a little later |
| - st/glsl_to_nir: use nir based program resource list builder |
| - st/glsl_to_nir: fix SSO validation regression |
| - glsl: rename gl_nir_link() to gl_nir_link_spirv() |
| - glsl: add gl_nir_link_check_atomic_counter_resources() |
| - glsl: add new gl_nir_link_glsl() helper |
| - glsl: reorder link_and_validate_uniforms() calls |
| - mesa: add new UseNIRGLSLLinker constant |
| - glsl: use nir linker to link atomics |
| - glsl: add check_image_resources() for the nir linker |
| - glsl: use nir version of check_image_resources() for nir linker |
| - glsl: move check_subroutine_resources() into the shared util code |
| - glsl: call check_subroutine_resources() from the nir linker |
| - glsl: move uniform resource checks into the common linker code |
| - glsl: call uniform resource checks from the nir linker |
| - glsl: move calculate_subroutine_compat() to shared linker code |
| - glsl: call calculate_subroutine_compat() from the nir linker |
| - glsl: fix potential bug in nir uniform linker |
| - glsl: remove bogus assert in nir uniform linking |
| - glsl: fix check for matrices in blocks when using nir uniform linker |
| - glsl: count uniform components and storage better in nir linking |
| - glsl_to_nir: update interface type properly |
| - glsl: fix gl_nir_set_uniform_initializers() for image arrays |
| |
| Timur Kristóf (39): |
| |
| - ac: Handle invalid GFX10 format correctly in ac_get_tbuffer_format. |
| - aco: Make sure not to mistakenly propagate 64-bit constants. |
| - aco: Treat all booleans as per-lane. |
| - aco: Optimize out trivial code from uniform bools. |
| - aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce. |
| - aco: Remove superfluous argument from emit_boolean_logic. |
| - aco: Remove lower_linear_bool_phi, it is not needed anymore. |
| - aco: Optimize load_subgroup_id to one bit field extract instruction. |
| - aco/wave32: Change uniform bool optimization to work with wave32. |
| - aco/wave32: Replace hardcoded numbers in spiller with wave size. |
| - aco/wave32: Introduce emit_mbcnt which takes wave size into account. |
| - aco/wave32: Add wave size specific opcodes to aco_builder. |
| - aco/wave32: Use lane mask regclass for exec/vcc. |
| - aco/wave32: Fix load_local_invocation_index to support wave32. |
| - aco/wave32: Use wave_size for barrier intrinsic. |
| - aco/wave32: Allow setting the subgroup ballot size to 64-bit. |
| - aco/wave32: Fix reductions. |
| - aco: Fix uniform i2i64. |
| - ac/llvm: Fix ac_build_reduce in wave32 mode. |
| - aco/wave32: Set the definitions of v_cmp instructions to the lane |
| mask. |
| - aco: Implement 64-bit constant propagation. |
| - aco: Allow optimizing vote_all and nir_op_iand. |
| - aco: Don't skip combine_instruction when definitions[1] is used. |
| - aco: Optimize out s_and with exec, when used on uniform bitwise |
| values. |
| - aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible. |
| - nouveau/nvc0: add extern keyword to nvc0_miptree_vtbl. |
| - intel/compiler: Fix array bounds warning on GCC 10. |
| - radeon: Move si_get_pic_param to radeon_vce.c |
| - r600: Move get_pic_param to radeon_vce.c |
| - gallium: Fix a couple of multiple definition warnings. |
| - radeon: Fix multiple definition error with radeon_debug |
| - aco: Fix -Wstringop-overflow warnings in aco_span. |
| - aco: Fix maybe-uninitialized warnings. |
| - aco: Fix signedness compare warning. |
| - aco: Make a better guess at which instructions need the VCC hint. |
| - aco: Transform uniform bitwise instructions to 32-bit if possible. |
| - aco/gfx10: Fix VcmpxExecWARHazard mitigation. |
| - aco: Fix the meaning of is_atomic. |
| - aco/optimizer: Don't combine uniform bool s_and to s_andn2. |
| |
| Tomasz Pyra (1): |
| |
| - gallium/swr: Fix arb_transform_feedback2 |
| |
| Tomeu Vizoso (38): |
| |
| - gitlab-ci: Disable lima jobs |
| - gitlab-ci: Run only LAVA jobs in special-named branches |
| - panfrost: Add checksum fields to SFBD descriptor |
| - panfrost: Set 0x10 bit on mali_shader_meta.unknown2_4 on T720 |
| - panfrost: Rework format encoding on SFBD |
| - panfrost: Take into account texture layers in SFBD |
| - panfrost: Decode blend shaders for SFBD |
| - panfrost: Generate polygon list manually for SFBD |
| - panfrost: Print the right zero field |
| - panfrost: Pipe the GPU ID into compiler and disassembler |
| - panfrost: Set depth and stencil for SFBD based on the format |
| - panfrost: Multiply offset_units by 2 |
| - panfrost: Make sure the shader descriptor is in sync with the GL |
| state |
| - gitlab-ci: Remove limit on kernel logging |
| - panfrost: Just print tiler fields as-is for Tx20 |
| - panfrost: Rework buffers in SFBD |
| - gitlab-ci: Fix dir name for VK-GL-CTS sources |
| - panfrost: Don't print the midgard_blend_rt structs on SFBD |
| - panfrost: Add quirks system to cmdstream |
| - panfrost: Simplify shader patching |
| - panfrost: White list the Mali T720 |
| - gitlab-ci: Test Panfrost on T720 GPUs |
| - panfrost: Add PAN_MESA_DEBUG=sync |
| - panfrost: Hold a reference to sampler views |
| - pan/midgard: Remove undefined behavior |
| - nir: Don't copy empty array |
| - util: Don't access members of NULL pointers |
| - panfrost: Don't lose bits! |
| - st/mesa: Don't access members of NULL pointers |
| - panfrost: Handle Z24_UNORM_S8_UINT as MALI_Z32_UNORM |
| - panfrost: Increase PIPE_SHADER_CAP_MAX_OUTPUTS to 16 |
| - panfrost: Dynamically allocate array of texture pointers |
| - panfrost: Map with size of first layer for 3D textures |
| - panfrost: Store internal format |
| - gitlab-ci: Update kernel for LAVA to 5.5-rc1 plus fixes |
| - gitlab-ci: Switch LAVA jobs to use shared dEQP runner |
| - gitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5 |
| - gitlab-ci: Consolidate container and build stages for LAVA |
| |
| Urja Rannikko (4): |
| |
| - panfrost: free last_read/write tables in mir_create_dependency_graph |
| - panfrost: free allocations in schedule_block |
| - panfrost: add lcra_free() to free lcra state |
| - panfrost: free spill cost table in mir_spill_register |
| |
| Vasily Khoruzhick (31): |
| |
| - lima: add debug prints for BO cache |
| - lima: align size before trying to fetch BO from cache |
| - lima: ignore flags while looking for BO in cache |
| - lima: set dithering flag when necessary |
| - lima: add support for gl_PointSize |
| - lima: enable tiling |
| - lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle() |
| - lima: expose tiled format modifier in query_dmabuf_modifiers() |
| - lima: use single BO for GP outputs |
| - lima: drop suballocator |
| - lima: fix allocation of GP outputs storage for indexed draw |
| - lima: postpone PP stream generation |
| - lima: don't reload and redraw tiles that were not updated |
| - lima: fix PP stream terminator size |
| - lima: use linear layout for shared buffers if modifier is not |
| specified |
| - lima: add debug flag to disable tiling |
| - lima: drop support for R8G8B8 format |
| - lima: fix PLBU_CMD_PRIMITIVE_SETUP command |
| - lima: fix viewport clipping |
| - lima: implement polygon offset |
| - lima: fix PIPE_CAP\_\* to mark features that aren't supported yet |
| - lima: add new findings to texture descriptor |
| - lima: fix handling of reverse depth range |
| - ci: lava: pass CI_NODE_INDEX and CI_NODE_TOTAL to lava jobs |
| - ci: Re-enable CI for lima on mali450 |
| - lima: implement invalidate_resource() |
| - nir: don't emit ishl in \_nir_mul_imm() if backend doesn't support |
| bitops |
| - lima: use imul for calculations with intrinsic src |
| - lima: ppir: don't delete root ld_tex nodes without successors in |
| current block |
| - lima: ppir: always create move and update ld_tex successors for all |
| blocks |
| - lima: disable early-z if fragment shader uses discard |
| |
| Vinson Lee (9): |
| |
| - swr: Fix build with llvm-10.0. |
| - panfrost: Fix gnu-empty-initializer build errors. |
| - scons: Bump C standard to gnu11 on macOS 10.15. |
| - util/u_thread: Restrict u_thread_get_time_nano on macOS. |
| - swr: Fix build with llvm-10.0. |
| - swr: Fix build with llvm-10.0. |
| - lima: Fix build with GCC 10. |
| - swr: Fix GCC 4.9 checks. |
| - panfrost: Remove unused anonymous enum variables. |
| |
| Wladimir J. van der Laan (2): |
| |
| - u_vbuf: add logic to use a limited number of vbufs |
| - u_vbuf: use single vertex buffer if it's not possible to have |
| multiple |
| |
| X512 (1): |
| |
| - util/u_thread: Fix build under Haiku |
| |
| Yevhenii Kolesnikov (5): |
| |
| - glsl: Enable textureSize for samplerExternalOES |
| - meson: Fix linkage of libgallium_nine with libgalliumvl |
| - meta: Cleanup function for DrawTex |
| - main: allow external textures for BindImageTexture |
| - meta: Add cleanup function for Bitmap |
| |
| Zebediah Figura (1): |
| |
| - Revert "draw: revert using correct order for prim decomposition." |
| |
| luc (1): |
| |
| - zink: confused compilation macro usage for zink in target helpers. |