All-Flash NVMe Pool. Slow Read.

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
Over the past year, I've had a good experience with TrueNAS SCALE.
Good data stability, adequate performance. It used six 14TB SAS HDDs plus separate SLOG disks, and performed satisfactorily on 10G to 40G network configurations.
This time, I have built a new system for all-flash storage, something I have been interested in and aiming at for a long time.
  • System: Dell PowerEdge R640 10-Bay with NVMe enabled configuration
  • DISK (for boot): 200GB SAS SSD x 2
  • DISK (for DATA): 3.84TB U.2 NVMe x 8 (Micron 7300 Pro)
  • MEM: 64GB ECC/REG
  • CPU: Xeon Silver 4114 x 2 (2-CPU configuration)
  • NIC: Intel i350+X550 rNDC, Mellanox ConnectX-4 100GbE Dual Port
I configured the 8 disks in a single RAIDZ1 vdev and disabled Sync.
I left the rest of the settings at their defaults.
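Just to confirm what the layout and dataset settings actually look like, this is how I double-check them from the shell ("tank" below is a placeholder for my actual pool/dataset name):
Code:
# Confirm the pool topology: a single raidz1 vdev of eight NVMe disks
zpool status tank

# Confirm the dataset properties that matter for reads ("tank" is a placeholder)
zfs get sync,recordsize,compression,atime tank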
And the result is an inexplicably slow read speed.

I'm not chasing benchmark numbers, so this isn't about a synthetic result.
I simply replicated the data from the existing HDD-based TrueNAS system to the new all-flash system, and after replication, reads from the client machine run at only 50 to 100 MB/s. I'm trying to find the cause of that.

The client communicates with the system through a 10GbE NIC, and writes run at effectively the full 10GbE line rate.
Yes, that's about 1 GB/s, and I know this is the result of write caching in RAM and the strengths of NVMe working beautifully.

To find out why the all-flash pool is slow to read, I spent about four days searching Google, the TrueNAS forums, Reddit, and other communities, trying to find and analyze the cause, but I came up empty.

After roughly four days of searching for information and similar cases, these are the possible causes I've listed:
1. ARC Issues
2. Low capacity RAM
3. Inconsistent ZFS and NVMe configurations
4. Cursed.
First, regarding ARC, I confirmed that the hit ratio drops during the read operation. (See the right end of the graph.)
[Attachments: 1685869276276.png, 1685869290240.png, 1685869303184.png]


And as suspected, ZFS does not appear to be actively reading from the disks.
[Attachment: 1685869245632.png]


I tried applying the Autotune function provided by TrueNAS, but it did not help.
The test was to copy approximately 60GB of zip files from the pool to the client, with nothing else running.
Actually, the system was originally installed with SCALE, but when the read-speed problem appeared, I reinstalled with CORE.
Disappointingly, the problem remains exactly the same.
I am using SMB and iSCSI; both show the same problem.

I know my configuration is below typical enterprise-level specifications.
But most of my workload is just storing cold data in the pool, with occasional browsing of videos and photos.
It would be more accurate to describe it as mid-range storage just for me.

So I thought I could simply enjoy the native performance of the NVMe drives, fast NICs, switch, and so on.
However, I've run into this unexpected problem, so I would like your opinions and help.
I actually expected write speed to be the problem, but the real problem is read speed.

If you can tell me what I missed and what else I should check, I'll try it.
 
Last edited:

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
More information.
The really funny part is that when I connect over SMB with a 1GbE NIC, the read speed gets even worse than over 10GbE (2 MB/s to 10 MB/s, LMAO).
 
Joined
Dec 29, 2014
Messages
1,135
I am just about to run out the door for a business trip, but two things strike me about your configuration. The first, and possibly most important, is that I think the amount of RAM you have is low for the number and size of your drives. Unless your data access patterns are totally random, the ARC will eventually get populated with the data that is being used most often. If that isn't happening, there are two likely causes: you either don't have enough RAM, or something else (like running virtual machines) is taking it. Worse still, in the second case it could be causing you to do memory swapping. The other item is your pool construction. Others can elaborate on this better than I can, but a single RAIDZ1 vdev ends up limiting you to the speed of a single drive. You would have to destroy and recreate the pool, but you might get better performance if you built your pool with (4) vdevs of (2)-drive mirrors.
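For illustration, the striped-mirror layout I mean would look something like this. Treat it purely as a sketch: the device names are placeholders, creating it destroys the existing pool, and in practice you would build it from the TrueNAS UI rather than the shell.
Code:
# Sketch only: four 2-way mirror vdevs striped together
# Device names (nvd0..nvd7) are placeholders for the actual NVMe devices
zpool create tank \
    mirror nvd0 nvd1 \
    mirror nvd2 nvd3 \
    mirror nvd4 nvd5 \
    mirror nvd6 nvd7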
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I configured 8 disks in RAIDZ1 and I disabled Sync.
I believe you mean sync writes?

The test was to copy approximately 60GB of Zip files from the Pool to the client, and nothing else was done
I would try with large video files (after a reboot in order to clear ARC) or with jgreco's solnet array.

You could also repeat the zip test and share the arc_summary output.

I believe the issue might be low ARC size along with your pool structure: either increase RAM, use L2ARC or change pool structure.
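To take the network and the SMB/iSCSI layer out of the equation, a plain local sequential read from the shell would also show whether the pool itself can deliver; the path below is a placeholder for one of your big zip files, and ideally you'd run it right after a reboot so the ARC is cold.
Code:
# Read a large file straight off the pool, bypassing SMB/iSCSI
# (path and file name are placeholders)
dd if=/mnt/tank/bigfile.zip of=/dev/null bs=1M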

Edited for spelling correction.
 
Last edited:

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
I believe you mean sync writes?


I would try with large video files (after a reboot in order to clear ARC) or with jgreco's solnet array.

You could also repeat the zip test and share the arc_summary output.

I believe the issue might be low ARC size: either increase it or use L2ARC.
Also your pool strutture does have an impact on performance, but I don't think it's the main issue here.
Ah yes, by 'Sync' I meant 'sync writes'.
And even while I was writing this post, I kept repeating the copy test you mentioned.
Same file, same location; the result is the same.
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
I am just about to run out the door for a business trip, but two things strike me about your configuration. The first, and possibly most important, is that I think the amount of RAM you have is low for the number and size of your drives. Unless your data access patterns are totally random, the ARC will eventually get populated with the data that is being used most often. If that isn't happening, there are two likely causes: you either don't have enough RAM, or something else (like running virtual machines) is taking it. Worse still, in the second case it could be causing you to do memory swapping. The other item is your pool construction. Others can elaborate on this better than I can, but a single RAIDZ1 vdev ends up limiting you to the speed of a single drive. You would have to destroy and recreate the pool, but you might get better performance if you built your pool with (4) vdevs of (2)-drive mirrors.
I understand that to mean a single vdev performs like a single disk. Is that right?
Right now it's showing less performance than even a single 7300 Pro.

From the information I gathered earlier, I had heard that I need 1GB of RAM per TiB of data.
So I configured it with 64GB; is that not accurate?
Of course, considering that this configuration is all NVMe, I recognize that it is a small amount of RAM.
However, I thought it would be acceptable for a very light workload compared to a real enterprise environment.
And I don't run jails or any additional applications on TrueNAS.
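My own back-of-the-envelope math with that rule of thumb, for what it's worth (please correct me if it's off):
Code:
# 8 x 3.84 TB = 30.72 TB raw ~= 27.9 TiB
# 1 GiB of RAM per TiB       ~= 28 GiB for the pool
# plus the 8-16 GiB base      => roughly 36-44 GiB, so 64 GiB clears the guideline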
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
You could also repeat the zip test and share the arc_summary output.
Here is the arc_summary output.
Code:
root@storage02[~]# arc_summary

------------------------------------------------------------------------
ZFS Subsystem Report                            Sun Jun 04 19:07:52 2023
FreeBSD 13.1-RELEASE-p7                                    zpl version 5
Machine: storage02.linuchan.moe (amd64)                 spa version 5000

ARC status:                                                      HEALTHY
        Memory throttle count:                                         0

ARC size (current):                                    84.5 %   48.4 GiB
        Target size (adaptive):                        84.5 %   48.3 GiB
        Min size (hard limit):                          3.5 %    2.0 GiB
        Max size (high water):                           28:1   57.2 GiB
        Most Frequently Used (MFU) cache size:         57.8 %   27.9 GiB
        Most Recently Used (MRU) cache size:           42.2 %   20.3 GiB
        Metadata cache size (hard limit):              75.0 %   42.9 GiB
        Metadata cache size (current):                  0.6 %  246.5 MiB
        Dnode cache size (hard limit):                 10.0 %    4.3 GiB
        Dnode cache size (current):                     0.5 %   20.5 MiB

ARC hash breakdown:
        Elements max:                                             524.7k
        Elements current:                             100.0 %     524.7k
        Collisions:                                                16.4k
        Chain max:                                                     3
        Chains:                                                    15.6k

ARC misc:
        Deleted:                                                      41
        Mutex misses:                                                157
        Eviction skips:                                             2.9k
        Eviction skips due to L2 writes:                               0
        L2 cached evictions:                                     0 Bytes
        L2 eligible evictions:                                  56.9 GiB
        L2 eligible MFU evictions:                     74.6 %   42.4 GiB
        L2 eligible MRU evictions:                     25.4 %   14.5 GiB
        L2 ineligible evictions:                                10.8 MiB

ARC total accesses (hits + misses):                                 7.5M
        Cache hit ratio:                               88.5 %       6.6M
        Cache miss ratio:                              11.5 %     866.3k
        Actual hit ratio (MFU + MRU hits):             86.0 %       6.5M
        Data demand efficiency:                        99.4 %       1.1M
        Data prefetch efficiency:                      17.8 %       1.0M

Cache hits by cache type:
        Most frequently used (MFU):                    85.1 %       5.7M
        Most recently used (MRU):                      12.1 %     803.9k
        Most frequently used (MFU) ghost:               3.3 %     221.4k
        Most recently used (MRU) ghost:                 1.7 %     115.3k

Cache hits by data type:
        Demand data:                                   16.3 %       1.1M
        Prefetch data:                                  2.8 %     183.1k
        Demand metadata:                               80.9 %       5.4M
        Prefetch metadata:                              0.1 %       4.9k

Cache misses by data type:
        Demand data:                                    0.8 %       6.9k
        Prefetch data:                                 97.7 %     846.1k
        Demand metadata:                                0.7 %       6.2k
        Prefetch metadata:                              0.8 %       7.2k

DMU prefetch efficiency:                                          130.7k
        Hit ratio:                                     99.0 %     129.3k
        Miss ratio:                                     1.0 %       1.4k

L2ARC not detected, skipping section

Tunables:
        abd_scatter_enabled                                            1
        abd_scatter_min_size                                        4097
        allow_redacted_dataset_mount                                   0
        anon_data_esize                                                0
        anon_metadata_esize                                            0
        anon_size                                                      0
        arc.average_blocksize                                       8192
        arc.dnode_limit                                                0
        arc.dnode_limit_percent                                       10
        arc.dnode_reduce_percent                                      10
        arc.evict_batch_limit                                         10
        arc.eviction_pct                                             200
        arc.grow_retry                                                 0
        arc.lotsfree_percent                                          10
        arc.max                                              61434000000
        arc.meta_adjust_restarts                                    4096
        arc.meta_limit                                                 0
        arc.meta_limit_percent                                        75
        arc.meta_min                                                   0
        arc.meta_prune                                             10000
        arc.meta_strategy                                              1
        arc.min                                                        0
        arc.min_prefetch_ms                                            0
        arc.min_prescient_prefetch_ms                                  0
        arc.p_dampener_disable                                         1
        arc.p_min_shift                                                0
        arc.pc_percent                                                 0
        arc.prune_task_threads                                         1
        arc.shrink_shift                                               0
        arc.sys_free                                                   0
        arc_free_target                                           345758
        arc_max                                              61434000000
        arc_min                                                        0
        arc_no_grow_shift                                              5
        async_block_max_blocks                      18446744073709551615
        autoimport_disable                                             1
        btree_verify_intensity                                         0
        ccw_retry_interval                                           300
        checksum_events_per_second                                    20
        commit_timeout_pct                                             5
        compressed_arc_enabled                                         1
        condense.indirect_commit_entry_delay_ms                        0
        condense.indirect_obsolete_pct                                25
        condense.indirect_vdevs_enable                                 1
        condense.max_obsolete_bytes                           1073741824
        condense.min_mapping_bytes                                131072
        condense_pct                                                 200
        crypt_sessions                                                 0
        dbgmsg_enable                                                  1
        dbgmsg_maxsize                                           4194304
        dbuf.cache_shift                                               5
        dbuf.metadata_cache_max_bytes               18446744073709551615
        dbuf.metadata_cache_shift                                      6
        dbuf_cache.hiwater_pct                                        10
        dbuf_cache.lowater_pct                                        10
        dbuf_cache.max_bytes                        18446744073709551615
        dbuf_state_index                                               0
        ddt_data_is_special                                            1
        deadman.checktime_ms                                       60000
        deadman.enabled                                                1
        deadman.failmode                                            wait
        deadman.synctime_ms                                       600000
        deadman.ziotime_ms                                        300000
        debug                                                          0
        debugflags                                                     0
        dedup.prefetch                                                 0
        default_bs                                                     9
        default_ibs                                                   15
        delay_min_dirty_percent                                       60
        delay_scale                                               500000
        dirty_data_max                                        4294967296
        dirty_data_max_max                                    4294967296
        dirty_data_max_max_percent                                    25
        dirty_data_max_percent                                        10
        dirty_data_sync_percent                                       20
        disable_ivset_guid_check                                       0
        dmu_object_alloc_chunk_shift                                   7
        dmu_offset_next_sync                                           1
        dmu_prefetch_max                                       134217728
        dtl_sm_blksz                                                4096
        embedded_slog_min_ms                                          64
        flags                                                          0
        fletcher_4_impl [fastest] scalar superscalar superscalar4 sse2 ssse3 avx2 avx512f
        free_bpobj_enabled                                             1
        free_leak_on_eio                                               0
        free_min_time_ms                                            1000
        history_output_max                                       1048576
        immediate_write_sz                                         32768
        initialize_chunk_size                                    1048576
        initialize_value                            16045690984833335022
        keep_log_spacemaps_at_export                                   0
        l2arc.exclude_special                                          0
        l2arc.feed_again                                               1
        l2arc.feed_min_ms                                            200
        l2arc.feed_secs                                                1
        l2arc.headroom                                                 2
        l2arc.headroom_boost                                         200
        l2arc.meta_percent                                            33
        l2arc.mfuonly                                                  0
        l2arc.noprefetch                                               0
        l2arc.norw                                                     0
        l2arc.rebuild_blocks_min_l2size                       1073741824
        l2arc.rebuild_enabled                                          0
        l2arc.trim_ahead                                               0
        l2arc.write_boost                                       40000000
        l2arc.write_max                                         10000000
        l2arc_feed_again                                               1
        l2arc_feed_min_ms                                            200
        l2arc_feed_secs                                                1
        l2arc_headroom                                                 2
        l2arc_noprefetch                                               0
        l2arc_norw                                                     0
        l2arc_write_boost                                       40000000
        l2arc_write_max                                         10000000
        l2c_only_size                                                  0
        livelist.condense.new_alloc                                    0
        livelist.condense.sync_cancel                                  0
        livelist.condense.sync_pause                                   0
        livelist.condense.zthr_cancel                                  0
        livelist.condense.zthr_pause                                   0
        livelist.max_entries                                      500000
        livelist.min_percent_shared                                   75
        lua.max_instrlimit                                     100000000
        lua.max_memlimit                                       104857600
        max_async_dedup_frees                                     100000
        max_auto_ashift                                               14
        max_dataset_nesting                                           50
        max_log_walking                                                5
        max_logsm_summary_length                                      10
        max_missing_tvds                                               0
        max_missing_tvds_cachefile                                     2
        max_missing_tvds_scan                                          0
        max_nvlist_src_size                                            0
        max_recordsize                                           1048576
        metaslab.aliquot                                         1048576
        metaslab.bias_enabled                                          1
        metaslab.debug_load                                            0
        metaslab.debug_unload                                          0
        metaslab.df_alloc_threshold                               131072
        metaslab.df_free_pct                                           4
        metaslab.df_max_search                                  16777216
        metaslab.df_use_largest_segment                                0
        metaslab.find_max_tries                                      100
        metaslab.force_ganging                                  16777217
        metaslab.fragmentation_factor_enabled                          1
        metaslab.fragmentation_threshold                              70
        metaslab.lba_weighting_enabled                                 1
        metaslab.load_pct                                             50
        metaslab.max_size_cache_sec                                 3600
        metaslab.mem_limit                                            25
        metaslab.preload_enabled                                       1
        metaslab.preload_limit                                        10
        metaslab.segment_weight_enabled                                1
        metaslab.sm_blksz_no_log                                   16384
        metaslab.sm_blksz_with_log                                131072
        metaslab.switch_threshold                                      2
        metaslab.try_hard_before_gang                                  0
        metaslab.unload_delay                                         32
        metaslab.unload_delay_ms                                  600000
        mfu_data_esize                                       27499166720
        mfu_ghost_data_esize                                 16689725440
        mfu_ghost_metadata_esize                                28829696
        mfu_ghost_size                                       16718555136
        mfu_metadata_esize                                      16240128
        mfu_size                                             29924290560
        mg.fragmentation_threshold                                    95
        mg.noalloc_threshold                                           0
        min_auto_ashift                                                9
        min_metaslabs_to_flush                                         1
        mru_data_esize                                       21074184704
        mru_ghost_data_esize                                   326181888
        mru_ghost_metadata_esize                               152792064
        mru_ghost_size                                         478973952
        mru_metadata_esize                                         76800
        mru_size                                             21823943680
        multihost.fail_intervals                                      10
        multihost.history                                              0
        multihost.import_intervals                                    20
        multihost.interval                                          1000
        multilist_num_sublists                                         0
        no_scrub_io                                                    0
        no_scrub_prefetch                                              0
        nocacheflush                                                   0
        nopwrite_enabled                                               1
        obsolete_min_time_ms                                         500
        pd_bytes_max                                            52428800
        per_txg_dirty_frees_percent                                   30
        prefetch.array_rd_sz                                     1048576
        prefetch.disable                                               0
        prefetch.max_distance                                   67108864
        prefetch.max_idistance                                  67108864
        prefetch.max_sec_reap                                          2
        prefetch.max_streams                                           8
        prefetch.min_distance                                    4194304
        prefetch.min_sec_reap                                          1
        read_history                                                   0
        read_history_hits                                              0
        rebuild_max_segment                                      1048576
        rebuild_scrub_enabled                                          1
        rebuild_vdev_limit                                      33554432
        reconstruct.indirect_combinations_max                       4096
        recover                                                        0
        recv.queue_ff                                                 20
        recv.queue_length                                       16777216
        recv.write_batch_size                                    1048576
        removal_suspend_progress                                       0
        remove_max_segment                                      16777216
        resilver_disable_defer                                         0
        resilver_min_time_ms                                        3000
        scan_blkstats                                                  0
        scan_checkpoint_intval                                      7200
        scan_fill_weight                                               3
        scan_ignore_errors                                             0
        scan_issue_strategy                                            0
        scan_legacy                                                    0
        scan_max_ext_gap                                         2097152
        scan_mem_lim_fact                                             20
        scan_mem_lim_soft_fact                                        20
        scan_strict_mem_lim                                            0
        scan_suspend_progress                                          0
        scan_vdev_limit                                          4194304
        scrub_min_time_ms                                           1000
        send.corrupt_data                                              0
        send.no_prefetch_queue_ff                                     20
        send.no_prefetch_queue_length                            1048576
        send.override_estimate_recordsize                              0
        send.queue_ff                                                 20
        send.queue_length                                       16777216
        send.unmodified_spill_blocks                                   1
        send_holes_without_birth_time                                  1
        slow_io_events_per_second                                     20
        spa.asize_inflation                                           24
        spa.discard_memory_limit                                16777216
        spa.load_print_vdev_tree                                       0
        spa.load_verify_data                                           1
        spa.load_verify_metadata                                       1
        spa.load_verify_shift                                          4
        spa.slop_shift                                                 5
        space_map_ibs                                                 14
        special_class_metadata_reserve_pct                            25
        standard_sm_blksz                                         131072
        super_owner                                                    0
        sync_pass_deferred_free                                        2
        sync_pass_dont_compress                                        8
        sync_pass_rewrite                                              2
        sync_taskq_batch_pct                                          75
        top_maxinflight                                             1000
        traverse_indirect_prefetch_limit                              32
        trim.extent_bytes_max                                  134217728
        trim.extent_bytes_min                                      32768
        trim.metaslab_skip                                             0
        trim.queue_limit                                              10
        trim.txg_batch                                                32
        txg.history                                                  100
        txg.timeout                                                    5
        unflushed_log_block_max                                   131072
        unflushed_log_block_min                                     1000
        unflushed_log_block_pct                                      400
        unflushed_log_txg_max                                       1000
        unflushed_max_mem_amt                                 1073741824
        unflushed_max_mem_ppm                                       1000
        user_indirect_is_special                                       1
        validate_skip                                                  0
        vdev.aggregate_trim                                            0
        vdev.aggregation_limit                                   1048576
        vdev.aggregation_limit_non_rotating                       131072
        vdev.async_read_max_active                                     3
        vdev.async_read_min_active                                     1
        vdev.async_write_active_max_dirty_percent                     60
        vdev.async_write_active_min_dirty_percent                     30
        vdev.async_write_max_active                                    5
        vdev.async_write_min_active                                    1
        vdev.bio_delete_disable                                        0
        vdev.bio_flush_disable                                         0
        vdev.cache_bshift                                             16
        vdev.cache_max                                             16384
        vdev.cache_size                                                0
        vdev.def_queue_depth                                          32
        vdev.default_ms_count                                        200
        vdev.default_ms_shift                                         29
        vdev.file.logical_ashift                                       9
        vdev.file.physical_ashift                                      9
        vdev.initializing_max_active                                   1
        vdev.initializing_min_active                                   1
        vdev.max_active                                             1000
        vdev.max_auto_ashift                                          14
        vdev.min_auto_ashift                                           9
        vdev.min_ms_count                                             16
        vdev.mirror.non_rotating_inc                                   0
        vdev.mirror.non_rotating_seek_inc                              1
        vdev.mirror.rotating_inc                                       0
        vdev.mirror.rotating_seek_inc                                  5
        vdev.mirror.rotating_seek_offset                         1048576
        vdev.ms_count_limit                                       131072
        vdev.nia_credit                                                5
        vdev.nia_delay                                                 5
        vdev.queue_depth_pct                                        1000
        vdev.read_gap_limit                                        32768
        vdev.rebuild_max_active                                        3
        vdev.rebuild_min_active                                        1
        vdev.removal_ignore_errors                                     0
        vdev.removal_max_active                                        2
        vdev.removal_max_span                                      32768
        vdev.removal_min_active                                        1
        vdev.removal_suspend_progress                                  0
        vdev.remove_max_segment                                 16777216
        vdev.scrub_max_active                                          3
        vdev.scrub_min_active                                          1
        vdev.sync_read_max_active                                     10
        vdev.sync_read_min_active                                     10
        vdev.sync_write_max_active                                    10
        vdev.sync_write_min_active                                    10
        vdev.trim_max_active                                           2
        vdev.trim_min_active                                           1
        vdev.validate_skip                                             0
        vdev.write_gap_limit                                        4096
        version.acl                                                    1
        version.ioctl                                                 15
        version.module                         v2023051000-zfs_0a06f128c
        version.spa                                                 5000
        version.zpl                                                    5
        vnops.read_chunk_size                                    1048576
        vol.mode                                                       2
        vol.recursive                                                  0
        vol.unmap_enabled                                              1
        wrlog_data_max                                        8589934592
        xattr_compat                                                   1
        zap_iterate_prefetch                                           1
        zevent.len_max                                               512
        zevent.retain_expire_secs                                    900
        zevent.retain_max                                           2000
        zfetch.max_distance                                     67108864
        zfetch.max_idistance                                    67108864
        zil.clean_taskq_maxalloc                                 1048576
        zil.clean_taskq_minalloc                                    1024
        zil.clean_taskq_nthr_pct                                     100
        zil.maxblocksize                                          131072
        zil.min_commit_timeout                                      5000
        zil.nocacheflush                                               0
        zil.replay_disable                                             0
        zil.slog_bulk                                             786432
        zio.deadman_log_all                                            0
        zio.dva_throttle_enabled                                       1
        zio.exclude_metadata                                           0
        zio.requeue_io_start_cut_in_line                               1
        zio.slow_io_ms                                             30000
        zio.taskq_batch_pct                                           80
        zio.taskq_batch_tpq                                            0
        zio.use_uma                                                    1

VDEV cache disabled, skipping section

ZIL committed transactions:                                          788
        Commit requests:                                              33
        Flushes to stable storage:                                    33
        Transactions to SLOG storage pool:            0 Bytes          0
        Transactions to non-SLOG storage pool:        1.6 MiB         35
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
I believe the issue might be low ARC size along with your pool structure: either increase RAM, use L2ARC or change pool structure.
First of all, I'm a new user and don't have permission to edit my posts, so please bear with the multiple replies.

I can't change the pool structure because of usable capacity; I need 20TB+.
Maybe it's time to give up the boot-pool mirror, add a single Optane disk, and expand the RAM.
The chassis has 10 bays and already holds eight U.2 drives plus the two SAS SSDs for the boot pool.
I had heard before that I need 1GB of RAM per TiB, so I thought 64GB would be enough.
One thing I'm worried about is whether this problem will actually be solved if I invest in that.
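If I do add the Optane as L2ARC, my understanding is that it comes down to a single command (or the equivalent in the UI); the pool and device names below are placeholders I would confirm first.
Code:
# Add the Optane as an L2ARC (cache) device
# "tank" and "nvd8" are placeholders for my pool and the new Optane device
zpool add tank cache nvd8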
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I understand that a single Vdev means one disk. Is that right?
Now we're showing less performance than a single 7300 pro.
That's likely the ARC issue.

I heard that I need 1GB of RAM per TiB for the data I acquired before.
So I configured it with 64GB, is it different from the truth?
That's on top of the minimum requirements (8/16GB). I haven't done the math, but if you respected the 1GB per TB of storage guideline you should be OK.

I'd still suggest running the solnet array.

Anyway, wait for more qualified opinions; mine come from guesswork rather than hands-on experience.

Edit: you might find the following resources interesting.
 
Last edited:

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
Even now I'm analyzing the cause of the slow reads, but there's something I don't understand.
Why is this slower than my previous system with six 14TB SAS HDDs in the same RAIDZ1 layout?
Other than adding a separate SLOG device for writes, that system had no additional tuning.
It's even a system with only 32GB of RAM.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Back to basics: how are you connecting the drives and the motherboard?
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
Back to basics: how are you connecting the drives and the motherboard?
[Attachment: 1685879640965.png]

On the R640, the backplane and motherboard are connected via a Slimline SAS cable.
The corresponding PCIe lanes come from CPU2.
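To rule out a link-training problem, I can also check what each NVMe controller actually negotiated on the PCIe bus from CORE. "nvme0" below is just an example selector; I would repeat it for nvme1 through nvme7.
Code:
# Show PCIe capability/status (negotiated link width and speed) for one NVMe controller on FreeBSD
pciconf -lvc nvme0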
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Honestly, I don't have much more to tell you other than trying @jgreco's solnet array as a troubleshooting step.
I can't work out what the issue could be here, sorry.

Maybe @Etorix, @joeschmuck or someone else can help you.
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
Honestly, I don't have much more to tell you other than trying @jgreco's solnet array as a troubleshooting step.
I can't work out what the issue could be here, sorry.

Maybe @Etorix, @joeschmuck or someone else can help you.
Thank you for sharing your opinion.
The test is currently running in one-pass mode and is expected to take a long time, so I will post more when the results come out.
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
Honestly, I don't have much more to tell you other than trying @jgreco's solnet array as a troubleshooting step.
Allllright, my machine has spit out the results.
Here they are.

Code:
root@storage02[~]# ./solnet-array-test-v3.sh
sol.net disk array test v3

This is a nondestructive (read-only) full disk test designed to help
diagnose performance irregularities and to assist with disk burn-in

1) Use all disks (from camcontrol)
2) Use selected disks (from camcontrol|grep)
3) Specify disks   
4) Show camcontrol list

Option: 3

Enter disk devices separated by spaces (e.g. da1 da2): nvd0 nvd1 nvd2 nvd3 nvd4 nvd5 nvd6 nvd7

Selected disks: nvd0 nvd1 nvd2 nvd3 nvd4 nvd5 nvd6 nvd7
Is this correct? (y/N): y

You can select one-pass for the traditional once-thru mode, or
burn-in mode to keep looping forever.

One-pass or Burn-in mode? (o/B): o
Performing initial serial array read (baseline speeds)
Mon Jun  5 00:48:11 KST 2023
Mon Jun  5 01:06:14 KST 2023
Completed: initial serial array read (baseline speeds)

This test checks to see how fast one device at a time is.  If all
your disks are the same type and attached in the same manner. they
should be of similar speeds.  Each individual disk will now be
compared to the average speed.  Results that are unusually slow or
unusually fast may be tagged as such.  It is up to you to decide if
there is something wrong.

Array's average speed is 1893.38 MB/sec per disk

Disk    Disk Size  MB/sec %ofAvg
------- ---------- ------ ------
nvd0     3662830MB   1904    101
nvd1     3662830MB   1915    101
nvd2     3662830MB   1881     99
nvd3     3662830MB   1887    100
nvd4     3662830MB   1872     99
nvd5     3662830MB   1899    100
nvd6     3662830MB   1883     99
nvd7     3662830MB   1906    101

This next test attempts to read all devices in parallel.  This is
primarily a stress test of your disk controller, but may also find
limits in your PCIe bus, SAS expander topology, etc.  Ideally, if
all of your disks are of the same type and connected the same way,
then all of your disks should be able to read their contents in
about the same amount of time.  Results that are unusually slow or
unusually fast may be tagged as such.  It is up to you to decide if
there is something wrong.

Performing initial parallel array read
Mon Jun  5 01:06:14 KST 2023
The disk nvd0 appears to be 3662830 MB.
Disk is reading at about 2122 MB/sec
This suggests that this pass may take around 29 minutes

                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
nvd0     3662830MB   1904   2141    112 ++FAST++
nvd1     3662830MB   1915   2118    111 ++FAST++
nvd2     3662830MB   1881   2141    114 ++FAST++
nvd3     3662830MB   1887   2157    114 ++FAST++
nvd4     3662830MB   1872   2148    115 ++FAST++
nvd5     3662830MB   1899   2114    111 ++FAST++
nvd6     3662830MB   1883   2105    112 ++FAST++
nvd7     3662830MB   1906   2134    112 ++FAST++

Awaiting completion: initial parallel array read
Mon Jun  5 01:34:27 KST 2023
Completed: initial parallel array read

Disk's average time is 1686 seconds per disk

Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
nvd0        3840755982336    1692    100
nvd1        3840755982336    1681    100
nvd2        3840755982336    1688    100
nvd3        3840755982336    1681    100
nvd4        3840755982336    1690    100
nvd5        3840755982336    1677     99
nvd6        3840755982336    1690    100
nvd7        3840755982336    1685    100

This next test attempts to read all devices while forcing seeks.
This is primarily a stress test of your hard disks.  It does thhis
by running several simultaneous dd sessions on each disk.

Performing initial parallel seek-stress array read
Mon Jun  5 01:34:27 KST 2023
The disk nvd0 appears to be 3662830 MB.
Disk is reading at about 2932 MB/sec
This suggests that this pass may take around 21 minutes

                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
nvd0     3662830MB   1904   2933    154
nvd1     3662830MB   1915   3015    157
nvd2     3662830MB   1881   2970    158
nvd3     3662830MB   1887   2934    155
nvd4     3662830MB   1872   3000    160
nvd5     3662830MB   1899   2933    154
nvd6     3662830MB   1883   2933    156
nvd7     3662830MB   1906   2959    155

Awaiting completion: initial parallel seek-stress array read
Mon Jun  5 03:42:56 KST 2023
Completed: initial parallel seek-stress array read

Disk's average time is 4680 seconds per disk

Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
nvd0        3840755982336    5449    116 --SLOW--
nvd1        3840755982336    5060    108 --SLOW--
nvd2        3840755982336    4967    106
nvd3        3840755982336    4764    102
nvd4        3840755982336    4561     97
nvd5        3840755982336    4480     96
nvd6        3840755982336    4301     92 ++FAST++
nvd7        3840755982336    3859     82 ++FAST++


It seems like the NVMe disks are doing their job well.
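As a next step I plan to watch the pool itself while a client read is running, to see whether the slowness shows up at the vdev level or somewhere above it (pool name is a placeholder):
Code:
# Per-second, per-vdev throughput while the client copy is in progress
zpool iostat -v tank 1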
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
OK, I added more RAM.
The system is now at 256GB, but nothing changed for this problem; it's the same issue.
I found https://github.com/openzfs/zfs/issues/8381 and tried the tunables from it.


Code:
options zfs zfs_dirty_data_max_percent=30
options zfs zfs_txg_timeout=100
options zfs zfs_vdev_async_read_max_active=2048
options zfs zfs_vdev_async_read_min_active=1024
options zfs zfs_vdev_async_write_max_active=2048
options zfs zfs_vdev_async_write_min_active=1024
options zfs zfs_vdev_queue_depth_pct=100
options zfs zfs_vdev_sync_read_max_active=2048
options zfs zfs_vdev_sync_read_min_active=1024
options zfs zfs_vdev_sync_write_max_active=2048
options zfs zfs_vdev_sync_write_min_active=1024


But nothing improved.
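For anyone wanting to try the same thing on CORE, my understanding is that these module options map to vfs.zfs sysctls; the values below are just the ones from that GitHub issue, not something I'm claiming is correct.
Code:
# FreeBSD/CORE equivalents of a few of the options above (illustrative values taken from the issue)
sysctl vfs.zfs.txg.timeout=100
sysctl vfs.zfs.vdev.sync_read_max_active=2048
sysctl vfs.zfs.vdev.sync_read_min_active=1024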
 

Linuchan

Dabbler
Joined
Jun 4, 2023
Messages
27
Can you run an iperf test?
Sure.

Code:
Connecting to host 10.40.40.15, port 5201
[  5] local 10.40.40.21 port 56246 connected to 10.40.40.15 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.20 GBytes  10.3 Gbits/sec    0   1.49 MBytes       
[  5]   1.00-2.00   sec  1.23 GBytes  10.5 Gbits/sec    0   1.49 MBytes       
[  5]   2.00-3.00   sec  1.23 GBytes  10.6 Gbits/sec    0   1.77 MBytes       
[  5]   3.00-4.00   sec  1.21 GBytes  10.4 Gbits/sec    0   1.77 MBytes       
[  5]   4.00-5.00   sec  1.22 GBytes  10.5 Gbits/sec    0   1.77 MBytes       
[  5]   5.00-6.00   sec  1.15 GBytes  9.88 Gbits/sec    0   1.77 MBytes       
[  5]   6.00-7.00   sec  1.28 GBytes  11.0 Gbits/sec    0   1.77 MBytes       
[  5]   7.00-8.00   sec  1.26 GBytes  10.8 Gbits/sec    0   1.77 MBytes       
[  5]   8.00-9.00   sec  1.25 GBytes  10.8 Gbits/sec    0   1.77 MBytes       
[  5]   9.00-10.00  sec  1.27 GBytes  10.9 Gbits/sec    0   1.77 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.3 GBytes  10.6 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  12.3 GBytes  10.6 Gbits/sec                  receiver

iperf Done.
linu@drive:~$ 
 