[Git][xorg-team/lib/mesa][upstream-experimental] 48 commits: i965: Fix BRW_MEMZONE_LOW_4G heap size.


Timo Aaltonen-4
GitLab

Timo Aaltonen pushed to branch upstream-experimental at X Strike Force / lib / mesa

Commits:

  • fd27561c
    by Kenneth Graunke at 2019-05-08T10:26:08Z
    i965: Fix BRW_MEMZONE_LOW_4G heap size.
    
    
    
    The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages,
    
    and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB.
    
    
    
    So we can't use the top page.
    
    
    
    Fixes: 01058a55229 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 15f134c62853ed6679435a9e4ae40e3308fc7453)
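
The arithmetic in this commit message can be checked with a short Python sketch. The constant and function names below are illustrative assumptions, not identifiers from the driver.

```python
PAGE_SIZE = 4096          # bytes per page, as used in the commit message
MAX_SIZE_FIELD = 0xfffff  # largest page count the "Size" field can hold
FOUR_GIB = 1 << 32


def max_heap_bytes(max_pages: int = MAX_SIZE_FIELD,
                   page_size: int = PAGE_SIZE) -> int:
    """Largest heap size, in bytes, that the size field can describe."""
    return max_pages * page_size


# The representable size falls exactly one page short of 4 GiB.
shortfall = FOUR_GIB - max_heap_bytes()
print(max_heap_bytes(), shortfall)
```

The shortfall equals one page, which is why the top page of the low 4G zone cannot be used.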
    
    
  • faa7daa5
    by Kenneth Graunke at 2019-05-08T10:27:20Z
    i965: Force VMA alignment to be a multiple of the page size.
    
    
    
    This should happen regardless, but let's be paranoid.
    
    
    
    Fixes: 01058a55229 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 17210c63a91aaf018813b0d336f5f1d4fd87eafb)
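
As a rough illustration of what forcing an alignment to a multiple of the page size means, here is a Python sketch. The helper names and the exact rounding rule are assumptions made for the example; the commit message does not show the driver code.

```python
PAGE_SIZE = 4096


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of a power-of-two alignment."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)


def vma_alignment(requested: int) -> int:
    """Hypothetical helper: never hand the allocator a sub-page alignment."""
    return align_up(max(requested, 1), PAGE_SIZE)


for requested in (0, 64, 4096, 8192):
    print(requested, "->", vma_alignment(requested))
```

Alignments that are already page multiples pass through unchanged, so the rounding only matters for the paranoid case the commit describes.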
    
    
  • f770e81b
    by Kenneth Graunke at 2019-05-08T10:28:39Z
    i965: leave the top 4Gb of the high heap VMA unused
    
    
    
    This ports commit 9e7b0988d6e98690eb8902e477b51713a6ef9cae from anv
    
    to i965.  Thanks to Lionel for noticing that it was missing!
    
    
    
    Fixes: 01058a55229 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit d568fcd0a09751cd041cdd46bc585e209e0df394)
    
    
  • 9d610c1c
    by Timothy Arceri at 2019-05-08T10:32:39Z
    Revert "glx: Fix synthetic error generation in __glXSendError"
    
    
    
    This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.
    
    
    
    This seems to have broken a number of wine games. Let's revert
    
    everything for now and try again later.
    
    
    
    Acked-by: Adam Jackson <[hidden email]>
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590
    
    (cherry picked from commit a01b393c397c846345f03f76f1167dd667e0ee96)
    
    
  • 424b60dc
    by Lionel Landwerlin at 2019-05-09T10:38:10Z
    anv: rework queries writes to ensure ordering memory writes
    
    
    
    We use a mix of MI & PIPE_CONTROL commands to write our queries' data
    
    (results & availability). Those commands' memory write order is not
    
    guaranteed with regard to their order in the command stream, unless CS
    
    stalls are inserted between them. This is problematic for 2 reasons :
    
    
    
       1. We copy results from the device using MI commands even though
    
          the values are generated from PIPE_CONTROL, meaning we could
    
          copy unlanded values into the results and then copy the
    
          availability that is inconsistent with the values.
    
    
    
       2. We allow the user to poll on the availability values of the
    
          query pool from the CPU. If the availability lands in memory
    
          before the values then we could return invalid values.
    
    
    
    This change does 2 things to address this problem :
    
    
    
          - We use either PIPE_CONTROL or MI commands to write both
    
            queries values and availability, so that the ordering of the
    
            memory writes guarantees that if availability is visible,
    
            results are also visible.
    
    
    
          - For the occlusion & timestamp queries we apply a CS stall
    
            before copying the results on the device, to ensure that copying
    
            with MI commands sees the correct values of previous
    
            PIPE_CONTROL writes of availability (required by the Vulkan
    
            spec).
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Reported-by: Iago Toral Quiroga <[hidden email]>
    
    Cc: [hidden email]
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit a07d06f10352fc5fa40db8a723fa5842ebc660db)
    
    
  • d95797de
    by Lionel Landwerlin at 2019-05-09T10:39:19Z
    anv: fix use after free
    
    
    
    Once mem->bo is removed from the cache, it is likely to be freed.
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Fixes: b80930a6fea075 ("anv: add support for VK_EXT_memory_budget")
    
    Reviewed-by: Eric Engestrom <[hidden email]>
    
    (cherry picked from commit 43596e5f343e6f6dc9a81e36701324f79390cff3)
    
    
  • 4a7b0cc5
    by Dylan Baker at 2019-05-09T10:40:22Z
    meson: Force the use of config-tool for llvm
    
    
    
    meson git now has a cmake find method for llvm, but it lacks a couple of
    
    features that we use from the config tool version. Until that reaches
    
    parity we need to use the config-tool version.
    
    
    
    CC: 19.0 19.1 <[hidden email]>
    
    Reviewed-by: Eric Engestrom <[hidden email]>
    
    (cherry picked from commit 0d59459432cf077d768164091318af8fb1612500)
    
    
  • 5d7d13d2
    by Dave Airlie at 2019-05-09T10:43:03Z
    kmsro: add _dri.so to two of the kmsro drivers.
    
    
    
    Fixes: 8cfc17bdda3 (kmsro: Add the rest of the current set of tinydrm drivers.)
    
    
    
    Reviewed-by: Eric Engestrom <[hidden email]>
    
    (cherry picked from commit 0a42d5b98bc3083e20475eb1ecea20f9b876269d)
    
    
  • a97f44ac
    by Samuel Pitoiset at 2019-05-09T10:44:18Z
    radv: fix setting the number of rectangles when it's dyanmic
    
    
    
    We need to know the number of rectangles.
    
    
    
    This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*.
    
    
    
    Fixes: 5db0bf99944 ("radv: Implement VK_EXT_discard_rectangles.")
    
    Signed-off-by: Samuel Pitoiset <[hidden email]>
    
    Reviewed-by: Bas Nieuwenhuizen <[hidden email]>
    
    (cherry picked from commit 53dfff1c4d95ee21661d86256f44eae26b985b50)
    
    
  • e0c082d6
    by Lionel Landwerlin at 2019-05-10T16:58:53Z
    anv: Use corresponding type from the vector allocation
    
    
    
    We didn't notice this issue much because the 2 structs share a similar
    
    layout, except for the additional fields...
    
    
    
    We run into that issue in Anv :
    
    
    
    ==15236== Invalid write of size 8
    
    ==15236==    at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211)
    
    ==15236==    by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264)
    
    ==15236==    by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312)
    
    ==15236==    by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167)
    
    ==15236==    by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190)
    
    ==15236==    by 0x8CF60871: alloc_surface_state (anv_image.c:1122)
    
    ==15236==    by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519)
    
    ==15236==    by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358)
    
    ==15236==  Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd
    
    ==15236==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    
    ==15236==    by 0x8D2578E6: u_vector_init (u_vector.c:47)
    
    ==15236==    by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168)
    
    ==15236==    by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921)
    
    ==15236==    by 0x8CF56517: anv_CreateDevice (anv_device.c:1909)
    
    ==15236==    by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073)
    
    ==15236==    by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so)
    
    ==15236==    by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so)
    
    ==15236==    by 0x8BCB35C6: loader_create_device_chain (loader.c:5449)
    
    ==15236==    by 0x8BCBC230: vkCreateDevice (trampoline.c:838)
    
    
    
    v2: Rename mmap_cleanups to avoid confusion (Caio)
    
    
    
    v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio)
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648
    
    Cc: <[hidden email]>
    
    Reviewed-by: Caio Marcelo de Oliveira Filho <[hidden email]>
    
    (cherry picked from commit f2f6ac1c0811858374142022f81bdcf0207e640c)
    
    
  • f1ab2220
    by Rob Clark at 2019-05-10T17:00:35Z
    freedreno/ir3: fix rasterflat/glxgears
    
    
    
    Ofc legacy gl features that are broken don't trigger fails in deqp.  I
    
    should remember to test glxgears more often.
    
    
    
    Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs
    
    Signed-off-by: Rob Clark <[hidden email]>
    
    (cherry picked from commit 9faf218b8cdda81b5813e935d5ba6e0d57706a03)
    
    
  • 5e758033
    by Tomeu Vizoso at 2019-05-10T17:02:37Z
    panfrost: Fix two uninitialized accesses in compiler
    
    
    
    Valgrind was complaining of those.
    
    
    
    NIR_PASS only sets progress to TRUE if there was progress.
    
    
    
    nir_const_load_to_arr() only sets as many constants as the instruction
    
    has components.
    
    
    
    This was causing some dEQP tests to flip-flop, such as:
    
    
    
    dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color
    
    
    
    Signed-off-by: Tomeu Vizoso <[hidden email]>
    
    Reviewed-by: Alyssa Rosenzweig <[hidden email]>
    
    Fixes: 14531d676b11 ("nir: make nir_const_value scalar")
    
    (cherry picked from commit 554975bafa4ec17e19d90725c66feeb4e1f49d9e)
    
    
  • f8ec40e2
    by Tomeu Vizoso at 2019-05-10T17:04:58Z
    panfrost: Only take the fast paths on buffers aligned to block size
    
    
    
    As the functions operate on 16-byte blocks.
    
    
    
    Fixes this Valgrind error:
    
    
    
    Invalid read of size 4
    
       at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85)
    
       by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171)
    
       by 0x584F587: panfrost_tile_texture (pan_resource.c:489)
    
       by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525)
    
       by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516)
    
       by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515)
    
       by 0x5875F13: u_default_texture_subdata (u_transfer.c:80)
    
       by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
    
       by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
    
       by 0x5391353: teximage (teximage.c:3105)
    
       by 0x5391353: teximage_err (teximage.c:3132)
    
       by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
    
       by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
    
     Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd
    
       at 0x483F5C8: malloc (vg_replace_malloc.c:299)
    
       by 0x584F47D: panfrost_transfer_map (pan_resource.c:467)
    
       by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
    
       by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59)
    
       by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
    
       by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
    
       by 0x5391353: teximage (teximage.c:3105)
    
       by 0x5391353: teximage_err (teximage.c:3132)
    
       by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
    
       by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
    
       by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2)
    
    
    
    Signed-off-by: Tomeu Vizoso <[hidden email]>
    
    Reviewed-by: Emil Velikov <[hidden email]>
    
    Cc: 19.1 <[hidden email]>
    
    (cherry picked from commit c3538ab5702ceeead284c2b5f9e700f3082c8135)
    
    
  • 349153f0
    by Leo Liu at 2019-05-10T17:06:21Z
    winsys/amdgpu: add VCN JPEG to no user fence group
    
    
    
    There is no user fence for JPEG; the bug triggers the
    
    kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT)
    
    
    
    Signed-off-by: Leo Liu <[hidden email]>
    
    Acked-by: Christian König <[hidden email]>
    
    Reviewed-by: Bas Nieuwenhuizen <[hidden email]>
    
    Cc: [hidden email]
    
    (cherry picked from commit ceba9ff2948d7efa57d7035c7717f046303e7c64)
    
    
  • 1fc65774
    by Eric Engestrom at 2019-05-13T10:30:30Z
    travis: fix syntax, and drop unused stuff
    
    
    
    Fixes: a988d953899c099719f3 "ci: Delete autotools build jobs"
    
    Signed-off-by: Eric Engestrom <[hidden email]>
    
    (cherry picked from commit 6e5728e5c92b6d006862aae24763c3ce32ef20a6)
    
    
  • f0e147bd
    by Kenneth Graunke at 2019-05-13T10:31:35Z
    i965: Fix memory leaks in brw_upload_cs_work_groups_surface().
    
    
    
    This was taking a reference to the 64kB upload buffer and never
    
    returning it, leaking a reference each time this atom triggered.
    
    
    
    This leaked lots of 64kB upload BOs, eventually running us out
    
    of VMA space.  This would usually happen when using mpv to watch a
    
    movie, after 20-40 minutes.
    
    
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134
    
    Fixes: 63d7b33f516 i965/cs: Setup surface binding for gl_NumWorkGroups
    
    Reviewed-by: Caio Marcelo de Oliveira Filho <[hidden email]>
    
    (cherry picked from commit 3f60810de0a2960ec15118ef9888d9efc9ea605a)
    
    
  • 87722e0c
    by Lionel Landwerlin at 2019-05-13T10:37:08Z
    vulkan/overlay: keep allocating draw data until it can be reused
    
    
    
    The original implementation assumed that we could allocate the same
    
    amount of command buffers as the number of images in the swapchain.
    
    But the application could potentially render much faster and rerender
    
    into images that have been submitted for presentation but not yet
    
    presented.
    
    
    
    This change keeps on allocating command buffers, vertex buffer, vertex
    
    indices as well as a semaphore and a fence for as long as we can't
    
    reuse a previously submitted one.
    
    
    
    This fixes rendering issues in the overlay at high frame rates.
    
    
    
    v2: Don't recreate semaphores constantly (Józef)
    
    
    
    v3: Drop useless surface & FreeCommandBuffers (Józef)
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655
    
    Cc: 19.1 <[hidden email]>
    
    Reviewed-by: Józef Kucia <[hidden email]>
    
    (cherry picked from commit ad2b4aa37806779bdfc15d704940136c3db21eb4)
    
    [Juan: resolve trivial conflicts]
    
    Signed-off-by: Juan A. Suarez Romero <[hidden email]>
    
    
    
    Conflicts:
    
    	src/vulkan/overlay-layer/overlay.cpp
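
The allocation policy described in this commit (reuse a previously submitted set of draw data only once it is safe, otherwise allocate a new one) can be sketched in Python. This is a simplified model under stated assumptions: `DrawData`, `DrawPool` and the boolean `fence_signaled` are hypothetical stand-ins for the layer's per-draw resources and a fence status, not the layer's actual types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DrawData:
    ident: int
    fence_signaled: bool = False  # stand-in for a fence status query


@dataclass
class DrawPool:
    in_flight: List[DrawData] = field(default_factory=list)
    created: int = 0

    def acquire(self) -> DrawData:
        # Reuse the oldest submission only if the GPU has finished with it.
        if self.in_flight and self.in_flight[0].fence_signaled:
            draw = self.in_flight.pop(0)
            draw.fence_signaled = False
        else:
            # Nothing reusable yet: keep allocating instead of stomping
            # on resources that are still in use.
            draw = DrawData(ident=self.created)
            self.created += 1
        self.in_flight.append(draw)
        return draw


pool = DrawPool()
first = pool.acquire()
second = pool.acquire()      # first is still busy, so a new one is made
first.fence_signaled = True  # the GPU finished the first submission
third = pool.acquire()       # now the first one can be recycled
print(pool.created, third is first)
```

The pool grows only while the application outpaces the GPU, which matches the high-frame-rate scenario the commit message describes.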
    
    
  • 38fdfdaf
    by Caio Marcelo de Oliveira Filho at 2019-05-13T10:40:03Z
    anv: Fix limits when VK_EXT_descriptor_indexing is used
    
    
    
    Update various limits in
    
    VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously
    
    zero to their values from VkPhysicalDeviceLimits.  When using
    
    VK_EXT_descriptor_indexing, the former limits will apply to all the
    
    descriptor layout sets -- not only those using the new feature bits.
    
    
    
    For reference, VK_EXT_descriptor_indexing says
    
    
    
        "There are new descriptor set layout and descriptor pool creation
    
        flags that are required to opt in to the update-after-bind
    
        functionality, and there are separate maxPerStage* and
    
        maxDescriptorSet* limits that apply to these descriptor set
    
        layouts which may be much higher than the pre-existing limits. The
    
        old limits only count descriptors in non-updateAfterBind
    
        descriptor set layouts, and the new limits count descriptors in
    
        all descriptor set layouts in the pipeline layout."
    
    
    
    Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 3610081daa47009aef23a7ab4471e7a71a073127)
    
    
  • f7c0ca6d
    by Kenneth Graunke at 2019-05-13T10:41:16Z
    iris: Use full ways for L3 cache setup on Icelake.
    
    
    
    Anuj fixed this in i965 and anv, but the fix never landed in iris.
    
    Fixes tessellation corruption on Icelake.  Thanks to Rafael for
    
    bisecting this and tracking it down.
    
    
    
    Fixes: d0996d5fab6 iris: Emit default L3 config for the render pipeline
    
    Reviewed-by: Rafael Antognolli <[hidden email]>
    
    (cherry picked from commit 72ccefb5298203c6e1c4b40b60b5dd356900ad47)
    
    
  • bb845df9
    by Marek Olšák at 2019-05-13T10:44:06Z
    st/mesa: fix 2 crashes in st_tgsi_lower_yuv
    
    
    
    src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct
    
     tgsi_full_dst_register *, const struct tgsi_full_dst_register *, unsigned
    
     int): assertion "dst->Register.WriteMask" failed
    
    
    
    The second crash was due to insufficient allocated size for TGSI
    
    instructions.
    
    
    
    Cc: 19.0 19.1 <[hidden email]>
    
    Reviewed-by: Rob Clark <[hidden email]>
    
    (cherry picked from commit 83435e748f7c2c6bf1c946f2a489cce40b9ea05f)
    
    
  • e2654c23
    by Józef Kucia at 2019-05-13T10:45:10Z
    radv: clear vertex bindings while resetting command buffer
    
    
    
    Only vertex inputs accessed by vertex shader must have valid buffers
    
    bound.
    
    
    
    Signed-off-by: Józef Kucia <[hidden email]>
    
    Reviewed-by: Bas Nieuwenhuizen <[hidden email]>
    
    Fixes: 5010436e09f "radv: bail out when binding the same vertex buffers"
    
    (cherry picked from commit 24af0f1318967e20a8c5d7f3559389c341a0a11c)
    
    
  • 914ac06e
    by Bas Nieuwenhuizen at 2019-05-13T10:47:26Z
    radv: Do not use extra descriptor space for the 3rd plane.
    
    
    
    While ImageFormatProperties returns the number of internal descriptors,
    
    it turns out that applications do not need to actually allocate more
    
    descriptors in the descriptor pool.
    
    
    
    So if we make descriptors with more planes larger we have to be
    
    conservative and always allocate space for the larger descriptors
    
    which is a waste given the low usage of this ext.
    
    
    
    So let us make use of the fact that 3plane formats all have the
    
    same formats & dimensions for the last two planes. This way we
    
    only need the first half of the descriptor of the 3rd plane and
    
    can share the second half of the second plane.
    
    
    
    This allows us to use 16 bytes for the descriptor which nicely
    
    fits into the 16 bytes that are unused right next to the sampler.
    
    
    
    Fixes: 5564c38212a "radv: Update descriptor sets for multiple planes."
    
    Reviewed-by: Samuel Pitoiset <[hidden email]>
    
    (cherry picked from commit f53ebfb4503a1ae054539df1c414b86c3b1966d9)
    
    
  • 9b51dcf1
    by Gert Wollny at 2019-05-14T08:41:50Z
    softpipe/buffer: load only as many components as the the buffer resource type provides
    
    
    
    Otherwise we risk reading past the end of the buffer.
    
    
    
    In addition, change the loop counters to unsigned to be consistent
    
    with the types.
    
    
    
    Fixes: afa8707ba93a7d226a76319acda2a8dd89524db7
    
        softpipe: add SSBO/shader atomics support.
    
    
    
    Signed-off-by: Gert Wollny <[hidden email]>
    
    Reviewed-by: Dave Airlie <[hidden email]>
    
    (cherry picked from commit 865b9ddae4874186182e529b5fd154ab04a61f79)
    
    
  • c03d9a7f
    by Juan A. Suarez Romero at 2019-05-14T13:36:06Z
    Update version to 19.1.0-rc2
    
    
    
    Signed-off-by: Juan A. Suarez Romero <[hidden email]>
    
    
  • 8cf49e16
    by Jason Ekstrand at 2019-05-15T08:26:52Z
    intel/fs/ra: Only add dest interference to sources that exist
    
    
    
    Fixes: 83dedb6354d "i965: Add src/dst interference for certain"
    
    Reviewed-by: Kenneth Graunke <[hidden email]>
    
    (cherry picked from commit 88cac12230807456824d1f86f990a3926371a198)
    
    
  • 75ea0eee
    by Jason Ekstrand at 2019-05-15T08:28:06Z
    intel/fs/ra: Stop adding RA interference to too many SENDS nodes
    
    
    
    We only have one node per VGRF so this was adding way too much
    
    interference.  No idea how we didn't catch this before.
    
    
    
    Shader-db results on Kaby Lake:
    
    
    
        total instructions in shared programs: 15311100 -> 15311100 (0.00%)
    
        instructions in affected programs: 0 -> 0
    
        helped: 0
    
        HURT: 0
    
    
    
        total cycles in shared programs: 355468050 -> 355543197 (0.02%)
    
        cycles in affected programs: 2472492 -> 2547639 (3.04%)
    
        helped: 17
    
        HURT: 20
    
    
    
    Fixes: 014edff0d20d "intel/fs: Add interference between SENDS sources"
    
    Reviewed-by: Kenneth Graunke <[hidden email]>
    
    (cherry picked from commit 096ad8a8099cbcb3c868c08814fbe14ac79ca680)
    
    
  • 06bf5428
    by Ian Romanick at 2019-05-15T08:36:12Z
    Revert "nir: add late opt to turn inot/b2f combos back to bcsel"
    
    
    
    This reverts commit 7acc8652268205a266068ea4d059eccce43e1f78.
    
    
    
    With these optimizations in place, the extra constant folding added in
    
    the next commit extends some live ranges of 0.0 and ±1.0 constants, and
    
    that causes several hundred shaders to have more spills and fills.
    
    
    
    I believe this optimization was made basically irrelevant by 7725d609387
    
    "intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))".
    
    
    
    All Gen7.5+ platforms had similar results. (Ice Lake shown)
    
    total instructions in shared programs: 17225303 -> 17224634 (<.01%)
    
    instructions in affected programs: 879402 -> 878733 (-0.08%)
    
    helped: 679
    
    HURT: 1
    
    helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    
    helped stats (rel) min: 0.03% max: 0.93% x̄: 0.24% x̃: 0.05%
    
    HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
    
    HURT stats (rel)   min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
    
    95% mean confidence interval for instructions value: -1.02 -0.95
    
    95% mean confidence interval for instructions %-change: -0.26% -0.22%
    
    Instructions are helped.
    
    
    
    total cycles in shared programs: 360842595 -> 360828542 (<.01%)
    
    cycles in affected programs: 110443594 -> 110429541 (-0.01%)
    
    helped: 389
    
    HURT: 265
    
    helped stats (abs) min: 1 max: 7525 x̄: 162.81 x̃: 28
    
    helped stats (rel) min: <.01% max: 18.66% x̄: 1.11% x̃: 0.11%
    
    HURT stats (abs)   min: 1 max: 7614 x̄: 185.96 x̃: 48
    
    HURT stats (rel)   min: <.01% max: 25.08% x̄: 0.95% x̃: 0.10%
    
    95% mean confidence interval for cycles value: -75.65 32.67
    
    95% mean confidence interval for cycles %-change: -0.49% -0.06%
    
    Inconclusive result (value mean confidence interval includes 0).
    
    
    
    total spills in shared programs: 12159 -> 12161 (0.02%)
    
    spills in affected programs: 13 -> 15 (15.38%)
    
    helped: 0
    
    HURT: 1
    
    
    
    total fills in shared programs: 25207 -> 25208 (<.01%)
    
    fills in affected programs: 25 -> 26 (4.00%)
    
    helped: 0
    
    HURT: 1
    
    
    
    Ivy Bridge
    
    total instructions in shared programs: 12082019 -> 12082013 (<.01%)
    
    instructions in affected programs: 1033 -> 1027 (-0.58%)
    
    helped: 6
    
    HURT: 0
    
    helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
    
    helped stats (rel) min: 0.41% max: 0.83% x̄: 0.61% x̃: 0.59%
    
    95% mean confidence interval for instructions value: -1.00 -1.00
    
    95% mean confidence interval for instructions %-change: -0.78% -0.45%
    
    Instructions are helped.
    
    
    
    total cycles in shared programs: 179849270 -> 179849157 (<.01%)
    
    cycles in affected programs: 4735 -> 4622 (-2.39%)
    
    helped: 4
    
    HURT: 0
    
    helped stats (abs) min: 2 max: 74 x̄: 28.25 x̃: 18
    
    helped stats (rel) min: 0.13% max: 6.53% x̄: 2.85% x̃: 2.36%
    
    95% mean confidence interval for cycles value: -82.73 26.23
    
    95% mean confidence interval for cycles %-change: -7.98% 2.28%
    
    Inconclusive result (value mean confidence interval includes 0).
    
    
    
    Sandy Bridge
    
    total instructions in shared programs: 10882750 -> 10882748 (<.01%)
    
    instructions in affected programs: 266 -> 264 (-0.75%)
    
    helped: 2
    
    HURT: 0
    
    
    
    Iron Lake
    
    total cycles in shared programs: 188609440 -> 188609448 (<.01%)
    
    cycles in affected programs: 4320 -> 4328 (0.19%)
    
    helped: 0
    
    HURT: 2
    
    
    
    GM45
    
    total cycles in shared programs: 129016868 -> 129016872 (<.01%)
    
    cycles in affected programs: 2302 -> 2306 (0.17%)
    
    helped: 0
    
    HURT: 1
    
    
    
    Reviewed-by: Matt Turner <[hidden email]>
    
    (cherry picked from commit d2a9ba03e30602f040687da325470d72eeddef1a)
    
    [Juan: resolve trivial conflicts]
    
    Signed-off-by: Juan A. Suarez Romero <[hidden email]>
    
    
    
    Conflicts:
    
    	src/compiler/nir/nir_opt_algebraic.py
    
    
  • 51354d2b
    by Lionel Landwerlin at 2019-05-16T07:34:09Z
    nir: fix lower_non_uniform_access pass
    
    
    
    Obviously missing the instruction insertion into the SSA list.
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 391a836e8fb1c84170f3aa7550f0b347d31528f3)
    
    
  • 558a067d
    by Lionel Landwerlin at 2019-05-16T07:36:57Z
    vulkan/overlay-layer: fix cast errors
    
    
    
    Not quite sure what version of GCC/Clang produces errors (8.3.0
    
    locally was fine).
    
    
    
    v2: also fix an integer literal issue (Karol)
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Reviewed-by: Tapani Pälli <[hidden email]> (v1)
    
    Reviewed-by: Eric Engestrom <[hidden email]>
    
    (cherry picked from commit 2d2927938f074f402cab28aa5322567a76cbde58)
    
    
  • d70d8b2f
    by Lionel Landwerlin at 2019-05-16T07:40:47Z
    vulkan/overlay: fix truncating error on 32bit platforms
    
    
    
    Non dispatchable handles can be uint64_t. When compiling the layer on
    
    a 32bit platform, this will lead to casting uint64_t into (void *)
    
    which is 32bit, leading to incorrect handles being mapped internally
    
    in the layer.
    
    
    
    v2: Use more HKEY() (Eric)
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Reported-by: Józef Kucia <[hidden email]>
    
    Fixes: 2d2927938f074f ("vulkan/overlay-layer: fix cast errors")
    
    Reviewed-by: Józef Kucia <[hidden email]>
    
    (cherry picked from commit 877b371cbb2c51cd569d8e5bb3f00ef6d9724336)
    
    [Juan: resolve trivial conflicts]
    
    Signed-off-by: Juan A. Suarez Romero <[hidden email]>
    
    
    
    Conflicts:
    
    	src/vulkan/overlay-layer/overlay.cpp
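
The truncation problem described in this commit can be reproduced with plain integer arithmetic. The sketch below is in Python; `truncate_to_pointer` is a hypothetical helper that models a cast to a pointer of the given width, and the handle values are made up for the example.

```python
def truncate_to_pointer(handle: int, pointer_bits: int) -> int:
    """Model casting a 64-bit handle to a pointer of pointer_bits width."""
    return handle & ((1 << pointer_bits) - 1)


# Two distinct 64-bit handles that differ only in their upper 32 bits.
handle_a = 0x0000000100000010
handle_b = 0x0000000200000010

# With 32-bit pointers both handles collapse onto the same key.
key_a_32 = truncate_to_pointer(handle_a, 32)
key_b_32 = truncate_to_pointer(handle_b, 32)

# With 64-bit pointers they stay distinct.
key_a_64 = truncate_to_pointer(handle_a, 64)
key_b_64 = truncate_to_pointer(handle_b, 64)

print(hex(key_a_32), hex(key_b_32), key_a_64 != key_b_64)
```

Any map keyed on the truncated value would then confuse the two objects, which is the incorrect handle mapping the commit refers to.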
    
    
  • 5fcfcdb1
    by Lionel Landwerlin at 2019-05-16T17:18:34Z
    nir: lower_non_uniform_access: iterate over instructions safely
    
    
    
    This pass moves instructions around and adds control-flow in the
    
    middle of blocks. We need to use nir_foreach_instr_safe to ensure that
    
    we iterate over instructions correctly anyway.
    
    
    
    Signed-off-by: Lionel Landwerlin <[hidden email]>
    
    Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit e04cf0b61269ca60b3260d81d94e625965d39901)
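
Why a "safe" iterator matters when the loop body moves nodes can be shown with a small linked-list model in Python. The classes and functions below are illustrative assumptions and not NIR code; the only idea taken from the commit is that the next element must be captured before the body runs.

```python
class Instr:
    def __init__(self, name):
        self.name = name
        self.next = None


def build(names):
    """Build a singly linked list and return its head."""
    head = None
    for name in reversed(names):
        node = Instr(name)
        node.next = head
        head = node
    return head


def foreach_unsafe(head):
    node = head
    while node is not None:
        yield node
        node = node.next  # read after the body may have relinked the node


def foreach_safe(head):
    node = head
    while node is not None:
        following = node.next  # captured before the body runs
        yield node
        node = following


def move_all(iterate, head):
    """Visit every instruction while unlinking it, as a moving pass would."""
    visited = []
    for instr in iterate(head):
        instr.next = None  # simulates moving the instruction elsewhere
        visited.append(instr.name)
    return visited


print(move_all(foreach_unsafe, build(["a", "b", "c"])))
print(move_all(foreach_safe, build(["a", "b", "c"])))
```

The unsafe walk stops after the first instruction because its link was cleared, while the safe walk still reaches every instruction.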
    
    
  • 7fa89fd9
    by Eric Engestrom at 2019-05-16T17:20:13Z
    util/os_file: always use the 'grow' mechanism
    
    
    
    Use fstat() only to pre-allocate a big enough buffer.
    
    
    
    This fixes a race where if the file grows between fstat() and read()
    
    we would be missing the end of the file, and if the file slims down
    
    read() would just fail.
    
    
    
    Fixes: 316964709e21286c2af5 "util: add os_read_file() helper"
    
    Reported-by: Jason Ekstrand <[hidden email]>
    
    Signed-off-by: Eric Engestrom <[hidden email]>
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 22c1657d0552be0c558ca805c8d574e92f53c2cc)
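
The approach described here, using fstat() only as a sizing hint and reading until end of file, can be sketched in Python with the `os` module. The function name `read_whole_file` and the `chunk_hint` parameter are assumptions for the example; this is not the C helper itself.

```python
import os
import tempfile


def read_whole_file(path, chunk_hint=4096):
    """Read a file to EOF, treating the fstat() size purely as a hint."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # The size may be stale by the time we read, so never trust it
        # as the final length.
        read_size = max(os.fstat(fd).st_size, chunk_hint)
        chunks = []
        while True:
            chunk = os.read(fd, read_size)
            if not chunk:  # an empty read means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


payload = b"mesa" * 3000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
try:
    result = read_whole_file(tmp.name)
finally:
    os.unlink(tmp.name)
print(len(result))
```

Because the loop ends only on an empty read, a file that grows after fstat() is still read in full, and a file that shrinks simply yields fewer bytes instead of an error.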
    
    
  • b551be82
    by Marek Olšák at 2019-05-17T07:41:15Z
    radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets
    
    
    
    This is a prerequisite for the next commit.
    
    
    
    Cc: 19.1 <[hidden email]>
    
    (cherry picked from commit 0f1b070bad34c46c4bcc6c679fa533bf6b4b79e5)
    
    
  • 5bed00cf
    by Caio Marcelo de Oliveira Filho at 2019-05-21T08:42:32Z
    nir: Fix nir_opt_idiv_const when negatives are involved
    
    
    
    First, allow the case for negative powers of two.  Then ensure that we
    
    use the absolute value of the non-constant value to calculate the
    
    quotient -- this was hinted in the code by the name 'uq'.
    
    
    
    This fixes an issue when 'd' is positive and 'n' is negative.  The
    
    ishr will propagate the negative sign and we'll use nir_ineg() again,
    
    incorrectly.
    
    
    
    v2: First version used only ishr, but that isn't sufficient, since it
    
        never can produce a zero as a result.  (Jason)
    
        Allow negative powers of two.  (Caio)
    
    
    
    Fixes: 74492ebad94 "nir: Add a pass for lowering integer division by constants"
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 8a995f2b5e1e3f2a2eafd32870ebfb43b5cfdf27)
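
The sign handling described in this commit can be illustrated with a Python sketch of truncating division by a power-of-two constant. The function `idiv_pow2` is a hypothetical model of the idea (compute an unsigned quotient from absolute values, then fix the sign once); it is not the NIR pass and ignores fixed-width overflow cases.

```python
def idiv_pow2(n: int, d: int) -> int:
    """Truncating division of n by a constant d with |d| a power of two."""
    abs_d = abs(d)
    assert abs_d != 0 and abs_d & (abs_d - 1) == 0
    shift = abs_d.bit_length() - 1
    uq = abs(n) >> shift  # unsigned quotient, computed on |n|
    # Negate exactly once, and only when the operand signs differ.
    return -uq if (n < 0) != (d < 0) else uq


# An arithmetic shift alone keeps the sign bit, so a negative input can
# never shift down to zero:
shift_only = -1 >> 1

print(shift_only, idiv_pow2(-1, 2), idiv_pow2(-7, 4), idiv_pow2(7, -4))
```

Shifting the absolute value avoids both problems named in the commit message: the quotient can reach zero, and the sign is applied a single time.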
    
    
  • f69eb770
    by Nanley Chery at 2019-05-21T08:42:32Z
    anv: Fix some depth buffer sampling cases on ICL+
    
    
    
    Don't attempt sampling with HiZ if the sampler lacks support for it. On
    
    ICL, the HW docs state that sampling with HiZ is not supported and that
    
    instances of AUX_HIZ in the RENDER_SURFACE_STATE object will be
    
    interpreted as AUX_NONE.
    
    
    
    Cc: <[hidden email]>
    
    Reviewed-by: Lionel Landwerlin <[hidden email]>
    
    Reviewed-by: Anuj Phogat <[hidden email]>
    
    (cherry picked from commit 629806b55bccd7f3e5b7b753820c4442fdb30bbe)
    
    
  • d08fde8e
    by Dave Airlie at 2019-05-21T08:42:32Z
    glsl: init packed in more constructors.
    
    
    
    src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls.
    
    
    
    from Coverity.
    
    
    
    Fixes: 659f333b3a4 (glsl: add packed for struct types)
    
    
    
    Acked-by: Ilia Mirkin <[hidden email]>
    
    (cherry picked from commit b2d4d08a5cae29759bdbd4ac4e942ea372fe7735)
    
    
  • dab3945f
    by Gert Wollny at 2019-05-21T08:42:32Z
    Revert "softpipe/buffer: load only as many components as the buffer resource type provides"
    
    
    
    This reverts commit 865b9ddae4874186182e529b5fd154ab04a61f79.
    
    
    
    The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only
    
    one component would be supported. The original issue is still relevant, but
    
    the fix should be different.
    
    
    
    Signed-off-by: Gert Wollny <[hidden email]>
    
    Reviewed-by: Dave Airlie <[hidden email]>
    
    (cherry picked from commit 0f598ed7b3d2b3886ea5d742e7b0ced2b1702f28)
    
    
  • 8dbdeb27
    by Samuel Pitoiset at 2019-05-21T08:42:32Z
    radv: add a workaround for Monster Hunter World and LLVM 7&8
    
    
    
    The load/store optimizer pass doesn't handle WaW hazards correctly
    
    and this is the root cause of the reflection issue with Monster
    
    Hunter World. AFAIK, it's the only game that is affected by this
    
    issue.
    
    
    
    This is fixed with LLVM r361008, but we need a workaround for older
    
    LLVM versions unfortunately.
    
    
    
    Cc: "19.0" "19.1" <[hidden email]>
    
    Signed-off-by: Samuel Pitoiset <[hidden email]>
    
    Reviewed-by: Bas Nieuwenhuizen <[hidden email]>
    
    (cherry picked from commit d7501834cd86f9ec0b7c3dea17448dc523e36390)
    
    
  • 5d05324e
    by Jason Ekstrand at 2019-05-21T08:42:32Z
    anv: Emulate texture swizzle in the shader when needed
    
    
    
    Now that we have the descriptor buffer mechanism, emulated texture
    
    swizzle can be implemented in a very non-invasive way.  Previous
    
    attempts all tried to extend the push constant based image param
    
    mechanism which was gross.  This could, in theory, be done much faster
    
    with a magic back-end instruction which does indirect MOVs but Vulkan on
    
    IVB is already so slow this isn't going to matter much.
    
    
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104355
    
    Cc: "19.1" <[hidden email]>
    
    Reviewed-by: Lionel Landwerlin <[hidden email]>
    
    (cherry picked from commit d2aa65eb1892f7b300ac24560f9dbda6b600b5a7)
    
    
  • b6778c9f
    by Neha Bhende at 2019-05-21T08:42:32Z
    draw: fix memory leak introduced by 7720ce32a
    
    
    
    We need to free the PrimitiveOffsets allocation in draw_gs_destroy().
    
    This fixes a memory leak found while running piglit on Windows.
    
    
    
    Fixes: 7720ce32a ("draw: add support to tgsi paths for geometry streams. (v2)")
    
    
    
    Tested with piglit
    
    
    
    Reviewed-by: Brian Paul <[hidden email]>
    
    Reviewed-by: Charmaine Lee <[hidden email]>
    
    Reviewed-by: Dave Airlie <[hidden email]>
    
    (cherry picked from commit 926a6a35cf731552033224840c5b3e3edfd0131c)
    
    
  • 260f517d
    by Jason Ekstrand at 2019-05-21T08:42:32Z
    anv: Stop forcing bindless for images
    
    
    
    This was an unintended artifact of my testing of bindless images.  We
    
    should be choosing bindless or not dynamically.
    
    
    
    Fixes: c0d9926df7d "anv: Use bindless handles for images"
    
    Reviewed-by: Caio Marcelo de Oliveira Filho <[hidden email]>
    
    Reviewed-by: Lionel Landwerlin <[hidden email]>
    
    (cherry picked from commit 8413fd136c5f82cf8742ea306e139ac5d7bc18f3)
    
    
  • 2040f10c
    by Jason Ekstrand at 2019-05-21T08:42:32Z
    anv: Only consider minSampleShading when sampleShadingEnable is set
    
    
    
    From the Vulkan 1.1.107 spec:
    
    
    
        Sample shading is enabled for a graphics pipeline:
    
    
    
          - If the interface of the fragment shader entry point of the
    
            graphics pipeline includes an input variable decorated with
    
            SampleId or SamplePosition. In this case minSampleShadingFactor
    
            takes the value 1.0.
    
    
    
          - Else if the sampleShadingEnable member of the
    
            VkPipelineMultisampleStateCreateInfo structure specified when
    
            creating the graphics pipeline is set to VK_TRUE. In this case
    
            minSampleShadingFactor takes the value of
    
            VkPipelineMultisampleStateCreateInfo::minSampleShading.
    
    
    
        Otherwise, sample shading is considered disabled.
    
    
    
    In other words, if sampleShadingEnable is set to VK_FALSE, we should
    
    ignore minSampleShading.
    
    
    
    Cc: [hidden email]
    
    Reviewed-by: Lionel Landwerlin <[hidden email]>
    
    (cherry picked from commit 1c92358bd89313b0cf7bf7b84992a28f11b2aa8f)
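
The rule quoted from the spec reduces to a three-way decision. The sketch below is an illustration only, not the anv code: `struct ms_info`, `min_sample_shading_factor`, and the use of `0.0f` to mean "sample shading disabled" are all assumptions made for this example.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two VkPipelineMultisampleStateCreateInfo
 * members that matter here. */
struct ms_info {
    bool  sample_shading_enable;
    float min_sample_shading;
};

/* Returns the effective minSampleShadingFactor. As an assumption of this
 * sketch, 0.0f signals that sample shading is disabled. */
static float min_sample_shading_factor(const struct ms_info *ms,
                                       bool shader_reads_sample_id_or_pos)
{
    /* Case 1: the fragment shader reads SampleId or SamplePosition. */
    if (shader_reads_sample_id_or_pos)
        return 1.0f;

    /* Case 2: the application explicitly enabled sample shading. */
    if (ms->sample_shading_enable)
        return ms->min_sample_shading;

    /* Otherwise: disabled, so min_sample_shading is ignored. */
    return 0.0f;
}
```

The last branch is the one this commit is about: a non-zero `min_sample_shading` has no effect while `sample_shading_enable` is false.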
    
    
  • 6bac1a04
    by Eric Engestrom at 2019-05-21T08:42:32Z
    meson: expose glapi through osmesa
    
    
    
    Suggested-by: Pierre Guillou <[hidden email]>
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659
    
    Fixes: f121a669c7d94d2ff672 "meson: build gallium based osmesa"
    
    Fixes: cbbd5bb889a2c271a504 "meson: build classic osmesa"
    
    Cc: Brian Paul <[hidden email]>
    
    Cc: Dylan Baker <[hidden email]>
    
    Signed-off-by: Eric Engestrom <[hidden email]>
    
    Tested-by: Chuck Atkins <[hidden email]>
    
    (cherry picked from commit ccb8ea7acfb710c6c5298f3ffcadbe3d79b9b913)
    
    
  • 857210b0
    by Juan A. Suarez Romero at 2019-05-21T08:54:23Z
    cherry-ignore: radeonsi: update buffer descriptors in all contexts after buffer invalidation
    
    
    
    stable: this commit causes issues on several systems.
    
    
    
    Signed-off-by: Juan A. Suarez Romero <[hidden email]>
    
    
  • 04e9d7bf
    by Charmaine Lee at 2019-05-21T08:59:28Z
    st/mesa: purge framebuffers with current context after unbinding winsys buffers
    
    
    
    With commit c89e8470e58, framebuffers are purged after unbinding context,
    
    but this change also introduces a heap corruption when running the Rhino application
    
    on the VMware svga device. Instead of purging the framebuffers after the context
    
    is unbound, this patch first unbinds the winsys buffers, then purges the framebuffers
    
    with the current context, and then finally unbinds the context.
    
    
    
    This fixes the heap corruption.
    
    
    
    Cc: [hidden email]
    
    Reviewed-by: Brian Paul <[hidden email]>
    
    (cherry picked from commit b480adfa5ee224528eaed7e1da934a2d3e2b94d6)
    
    
  • 2153c3ae
    by Charmaine Lee at 2019-05-21T09:02:06Z
    mesa: unreference current winsys buffers when unbinding winsys buffers
    
    
    
    This fixes a surface leak when no winsys buffers are bound.
    
    
    
    Cc: [hidden email]
    
    Reviewed-by: Brian Paul <[hidden email]>
    
    (cherry picked from commit 12bf7cfecf52083c484602f971738475edfe497e)
    
    
  • ab75e1e2
    by Caio Marcelo de Oliveira Filho at 2019-05-21T09:04:42Z
    nir: Fix clone of nir_variable state slots
    
    
    
    When num_state_slots is 0, don't create the array.  This was
    
    triggering the following assert when running vkcube with
    
    NIR_TEST_CLONE=1:
    
    
    
        vkcube: ../src/compiler/nir/nir_split_per_member_structs.c:66:
    
        split_variable: Assertion `var->state_slots == NULL' failed.
    
    
    
    Fixes: 9fbd390dd4b "nir: Add support for cloning shaders"
    
    Reviewed-by: Jason Ekstrand <[hidden email]>
    
    (cherry picked from commit 005cc9ae37ca45960d87389dc9eace5ed29d1b99)
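
The invariant behind that assertion is that a variable with zero state slots keeps a NULL `state_slots` pointer, including after cloning. A minimal sketch of a clone routine that preserves it follows; `struct var` and `clone_var` are hypothetical stand-ins for illustration and do not reproduce the real `nir_variable` layout or Mesa's allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a variable with optional state slots. */
struct var {
    unsigned num_state_slots;
    int     *state_slots;   /* must stay NULL when num_state_slots == 0 */
};

/* Clone src; returns NULL on allocation failure. */
static struct var *clone_var(const struct var *src)
{
    struct var *dst = calloc(1, sizeof(*dst));
    if (!dst)
        return NULL;

    dst->num_state_slots = src->num_state_slots;

    /* Only create the array when there is something to copy, so that
     * an empty source yields a NULL pointer rather than an empty array. */
    if (src->num_state_slots > 0) {
        size_t bytes = src->num_state_slots * sizeof(*dst->state_slots);
        dst->state_slots = malloc(bytes);
        if (!dst->state_slots) {
            free(dst);
            return NULL;
        }
        memcpy(dst->state_slots, src->state_slots, bytes);
    }
    return dst;
}
```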
    
    
  • 1dd62eb6
    by Juan A. Suarez Romero at 2019-05-21T14:09:14Z
    Update version to 19.1.0-rc3
    
    
    
    Signed-off-by: Juan A. Suarez Romero <[hidden email]>
    
    

30 changed files:

The diff was not included because it is too large.