1 (edited by mfleetwo 2012-09-30 11:51:04)

Topic: GParted device refresh speed

Hi,

I was thinking about the speed of device refresh in GParted.  If we want to make it faster we first need to know what is taking the time.  So I've hacked up this patch.  Definitely not for applying upstream.  You can get it with:

git -c http.sslverify=false pull https://rockover.homeip.net/cgit/gparted timing-devel1

At the end is the output from my desktop which has a single hard drive, no raid and 15 partitions.  Some observations of my data:

  • A few external commands are very time consuming:

    • 2.1s (mdadm) scanning for software raid.

    • 1.0s (udevadm) udev event waiting.

    • 0.7s (lvm) 3 commands scanning LVM.

  • Many of the file system specific query commands are executed twice on unmounted file systems, once for the sector usage and once for the label.

  • There is another second which passes before the execution of udevadm which isn't yet identified.


Few thoughts on ways to mitigate this:

  • Have a way to identify what to rescan depending on what has changed.  For example mount/umount operations only change the busy status of one file system, yet perform a full rescan.

    • Only execute hardware and software raid scanning and udev waiting on full Refresh Devices, not after completing every operation.

  • Keep [Ctrl-R] Refresh Devices performing a full rescan to mitigate missing changes performed simultaneously outside GParted, or bugs not scanning for changes initiated in GParted.

  • Within a single rescan, cache the output of all commands used to query information so subsequent executions can just read the cache.

  • Change ext2/3/4, and possibly other file systems too, so that they use the same command to query sectors and label to make better use of the command output cache.
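
To illustrate the command output cache idea from the list above, here is a minimal sketch.  It is not GParted code: CommandResult, CommandCache and the injected runner are all hypothetical names, and the runner stands in for however commands really get executed.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical result of one external command: exit status plus captured output.
struct CommandResult
{
    int         exit_status;
    std::string output;
};

// Hypothetical per-rescan cache keyed by the full command line.
// The runner is injected so the sketch does not depend on how
// commands are actually executed.
class CommandCache
{
public:
    using Runner = std::function<CommandResult(const std::string &)>;

    explicit CommandCache(Runner runner) : m_runner(std::move(runner)) {}

    // Run the command only the first time it is seen during this rescan.
    const CommandResult & run(const std::string & command)
    {
        auto it = m_results.find(command);
        if (it == m_results.end())
            it = m_results.emplace(command, m_runner(command)).first;
        return it->second;
    }

    // Call at the start of every rescan so stale output is never reused.
    void clear() { m_results.clear(); }

private:
    Runner                               m_runner;
    std::map<std::string, CommandResult> m_results;
};
```

If ext2/3/4 read both the label and the sector usage from the same command line, the second lookup would be served from the cache instead of spawning the command again.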


Timing data:

  0.000000 +0.000000 STOPWATCH_RESET
  0.000000 +0.000000 find_supported_filesystems()   start
  0.006045 +0.006045 execute_command()              execute: sh -c btrfs filesystem label --help
  0.009035 +0.002990 execute_command()              exit status 0
  0.009807 +0.000773 execute_command()              execute: sh -c ntfslabel --help
  0.012314 +0.002506 execute_command()              exit status 256
  0.012722 +0.000408 find_supported_filesystems()   end
  0.117900 +0.105178 STOPWATCH_RESET
  0.000000 +0.000000 set_devices()                  start
  0.000030 +0.000030 set_devices()                  refreshing caches ...
  0.001112 +0.001082 execute_command()              execute: sh -c blkid
  0.219075 +0.217963 execute_command()              exit status 0
  0.219253 +0.000179 execute_command()              execute: sh -c dmraid -sa -c
  0.324554 +0.105301 execute_command()              exit status 256
  0.324650 +0.000096 execute_command()              execute: sh -c mdadm --examine --scan
  2.518364 +2.193714 execute_command()              exit status 0
  2.518478 +0.000114 execute_command()              execute: sh -c lvm vgscan
  3.000609 +0.482131 execute_command()              exit status 0
  3.000677 +0.000068 execute_command()              execute: sh -c lvm pvs --config "log{command_names=0}" --nosuffix --noheadings --separator , --units b -o pv_name,pv_size,pv_free,vg_name
  3.175259 +0.174581 execute_command()              exit status 0
  3.175321 +0.000063 execute_command()              execute: sh -c lvm pvs --config "log{command_names=0}" --nosuffix --noheadings --separator , --units b -o vg_name,vg_attr,lv_name,lv_attr
  3.194622 +0.019300 execute_command()              exit status 0
  3.194648 +0.000026 set_devices()                  refreshed caches
  3.614523 +0.419875 execute_command()              execute: sh -c e2label /dev/sda1
  3.617462 +0.002939 execute_command()              exit status 0
  3.630905 +0.013443 execute_command()              execute: sh -c e2label /dev/sda3
  3.633605 +0.002700 execute_command()              exit status 0
  3.645401 +0.011796 execute_command()              execute: sh -c e2label /dev/sda5
  3.648050 +0.002649 execute_command()              exit status 0
  3.654722 +0.006672 execute_command()              execute: sh -c btrfs filesystem show /dev/sda6
  3.897784 +0.243063 execute_command()              exit status 0
  3.898516 +0.000731 execute_command()              execute: sh -c e2label /dev/sda7
  3.910383 +0.011867 execute_command()              exit status 0
  3.910923 +0.000540 execute_command()              execute: sh -c e2label /dev/sda8
  3.914004 +0.003081 execute_command()              exit status 0
  3.914488 +0.000483 execute_command()              execute: sh -c e2label /dev/sda9
  3.926273 +0.011786 execute_command()              exit status 0
  3.926796 +0.000523 execute_command()              execute: sh -c jfs_tune -l /dev/sda10
  3.930643 +0.003847 execute_command()              exit status 0
  3.931600 +0.000957 execute_command()              execute: sh -c nilfs-tune -l /dev/sda12
  3.950734 +0.019134 execute_command()              exit status 0
  3.962121 +0.011387 execute_command()              execute: sh -c debugreiserfs /dev/sda14
  4.013574 +0.051453 execute_command()              exit status 0
  4.014161 +0.000587 execute_command()              execute: sh -c xfs_db -r -c 'label' /dev/sda15
  4.073133 +0.058972 execute_command()              exit status 0
  4.073622 +0.000489 execute_command()              execute: sh -c btrfs filesystem show /dev/sda6
  4.079587 +0.005965 execute_command()              exit status 0
  4.079743 +0.000157 execute_command()              execute: sh -c dumpe2fs -h /dev/sda7
  4.095954 +0.016211 execute_command()              exit status 0
  4.096129 +0.000175 execute_command()              execute: sh -c dumpe2fs -h /dev/sda8
  4.119581 +0.023452 execute_command()              exit status 0
  4.119779 +0.000198 execute_command()              execute: sh -c dumpe2fs -h /dev/sda9
  4.147974 +0.028195 execute_command()              exit status 0
  4.148176 +0.000202 execute_command()              execute: sh -c echo dm | jfs_debugfs /dev/sda10
  4.162963 +0.014787 execute_command()              exit status 0
  4.163078 +0.000115 execute_command()              execute: sh -c nilfs-tune -l /dev/sda12
  4.212579 +0.049500 execute_command()              exit status 0
  4.212791 +0.000212 execute_command()              execute: sh -c debugreiserfs /dev/sda14
  4.216071 +0.003281 execute_command()              exit status 0
  4.216204 +0.000133 execute_command()              execute: sh -c xfs_db -c 'sb 0' -c 'print blocksize' -c 'print dblocks' -c 'print fdblocks' -r /dev/sda15
  4.219917 +0.003713 execute_command()              exit status 0
  5.238062 +1.018145 execute_command()              execute: sh -c udevadm settle --timeout=1
  6.241096 +1.003034 execute_command()              exit status 256
  6.241345 +0.000249 set_devices()                  end
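
For anyone reading the listing above: the first column is seconds since the last STOPWATCH_RESET and the second is seconds since the previous line.  A minimal sketch of a stopwatch producing lines in that shape follows.  The class and method names are hypothetical and are not taken from the patch.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical stopwatch; names are illustrative only.
class Stopwatch
{
public:
    using Clock = std::chrono::steady_clock;

    Stopwatch() { reset(); }

    // Restart both the running total and the per-line interval.
    void reset() { m_start = m_last = Clock::now(); }

    // Format one line: seconds since reset, seconds since previous line, label.
    std::string stamp(const std::string & label)
    {
        const Clock::time_point now = Clock::now();
        const double total = std::chrono::duration<double>(now - m_start).count();
        const double delta = std::chrono::duration<double>(now - m_last).count();
        m_last = now;

        char buf[64];
        std::snprintf(buf, sizeof buf, "%10.6f +%.6f ", total, delta);
        return buf + label;
    }

private:
    Clock::time_point m_start;
    Clock::time_point m_last;
};
```

A monotonic clock is used here so that adjustments to the system time cannot produce negative intervals.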

Thanks,
Mike

2

Re: GParted device refresh speed

Hi Mike,

This is certainly another area that could benefit from some attention.  See:
Bug 631959 - Gparted wasting LOTS of time REPEATEDLY searching partitions after performing function

mfleetwo wrote:

Some observations of my data:

  • A few external commands are very time consuming:

    • 2.1s (mdadm) scanning for software raid.

    • 1.0s (udevadm) udev event waiting.

    • 0.7s (lvm) 3 commands scanning LVM.

  • Many of the file system specific query commands are executed twice on unmounted file systems, once for the sector usage and once for the label.

  • There is another second which passes before the execution of udevadm which isn't yet identified.

Perhaps this extra second is from the work around in GParted_Core::commit_to_os when compiled with libparted < 2.2.0?

mfleetwo wrote:

Few thoughts on ways to mitigate this:

  • Have a way to identify what to rescan depending on what has changed.  For example mount/umount operations only change the busy status of one file system, yet perform a full rescan.

    • Only execute hardware and software raid scanning and udev waiting on full Refresh Devices, not after completing every operation.

Tracking when the data becomes "dirty" sounds like a good approach to me.  There are some gotchas we will have to remember, such as when a logical partition is mounted/unmounted because it might change the status of the extended partition active/inactive.
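
As a rough sketch of choosing what to rescan from what changed, including the extended partition gotcha, something like the following might work.  Every name here is hypothetical, and the mapping from change to scope is an assumption for illustration, not a statement of what GParted needs.

```cpp
#include <cstdint>

// Hypothetical bit flags naming the separable parts of a refresh.
enum RescanScope : std::uint32_t
{
    RESCAN_NONE        = 0,
    RESCAN_BUSY_STATUS = 1u << 0,  // mounted / active state only
    RESCAN_FS_DETAILS  = 1u << 1,  // label, UUID and usage of a file system
    RESCAN_PARTITIONS  = 1u << 2,  // partition table of one device
    RESCAN_RAID_LVM    = 1u << 3,  // dmraid, mdadm and lvm scans
    RESCAN_UDEV_SETTLE = 1u << 4,  // wait for udev events
    RESCAN_ALL         = (1u << 5) - 1
};

// Hypothetical kinds of change initiated from within the application.
enum class Change { MOUNT_TOGGLED, LABEL_CHANGED, PARTITION_TABLE_CHANGED, USER_REFRESH };

struct RescanPlan
{
    std::uint32_t scope;
    bool          include_extended_parent;  // also recompute the enclosing extended partition
};

// Map a change to the smallest set of work assumed to be necessary.
RescanPlan plan_rescan(Change change, bool is_logical_partition)
{
    switch (change)
    {
        case Change::MOUNT_TOGGLED:
            // Mounting or unmounting a logical partition may flip the
            // active state of its extended partition as well.
            return { RESCAN_BUSY_STATUS, is_logical_partition };
        case Change::LABEL_CHANGED:
            return { RESCAN_FS_DETAILS, false };
        case Change::PARTITION_TABLE_CHANGED:
            return { RESCAN_PARTITIONS | RESCAN_FS_DETAILS | RESCAN_BUSY_STATUS, false };
        case Change::USER_REFRESH:
            break;
    }
    // Explicit Refresh Devices always does everything.
    return { RESCAN_ALL, false };
}
```

Only the explicit refresh includes the RAID, LVM and udev steps, matching the suggestion to keep those out of the per-operation refresh.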

mfleetwo wrote:
  • Keep [Ctrl-R] Refresh Devices performing a full rescan to mitigate missing changes performed simultaneously outside GParted, or bugs not scanning for changes initiated in GParted.

I agree with keeping this a full scan for the reasons outlined.

mfleetwo wrote:
  • Within a single rescan, cache the output of all commands used to query information so subsequent executions can just read the cache.

  • Change ext2/3/4, and possibly other file systems too, so that they use the same command to query sectors and label to make better use of the command output cache.

With these last two points it might make sense to have a file system object, or perhaps a more general partition contents object to handle more than just file systems (e.g., LVM).  A clean/dirty flag could then be used so that the object would reload the relevant information when needed.
Alternatively we might be able to modify the Partition object to handle this too. 

I think the caching and dirty status tracking of this information is likely one of the best ways to improve scan performance.
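
A minimal sketch of the clean/dirty flag idea, assuming a hypothetical wrapper object rather than any existing GParted class.  FsDetails, CachedFsDetails and the injected loader are made-up names for illustration.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical file system details that are expensive to query.
struct FsDetails
{
    std::string label;
    long long   sectors_used;
};

// Hypothetical wrapper that reloads the details only when marked dirty.
class CachedFsDetails
{
public:
    using Loader = std::function<FsDetails()>;

    explicit CachedFsDetails(Loader loader) : m_loader(std::move(loader)) {}

    // Call after any operation that may have changed the file system.
    void mark_dirty() { m_dirty = true; }

    // Return cached details, running the loader first if they are stale.
    const FsDetails & get()
    {
        if (m_dirty)
        {
            m_details = m_loader();
            m_dirty = false;
        }
        return m_details;
    }

private:
    Loader    m_loader;
    FsDetails m_details{};
    bool      m_dirty = true;  // nothing has been loaded yet
};
```

The same shape could hold LVM or other partition contents by swapping the details type and the loader.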

Another way I have seen with other tools is to only scan the relevant partition and file system information when the user clicks on the device in the GUI.  That way all devices are not scanned at start-up.  This would spread out the scan time for persons with more than one device, but would not likely be of much benefit to a majority of users with a single disk device.

3 (edited by mfleetwo 2012-10-03 17:07:39)

Re: GParted device refresh speed

gedakc wrote:

Perhaps this extra second is from the work around in GParted_Core::commit_to_os when compiled with libparted < 2.2.0?

I have put extra timing calls into commit_to_os().  The extra second not accounted for from execution of external commands is from the first call to ped_disk_commit_to_os() in GParted_Core::commit_to_os().

  ...
  3.440272 +0.000106 commit_to_os()                 timeout=1
  3.440311 +0.000039 commit_to_os()                 Calling is_dmraid_device("/dev/sda") ...
  3.440317 +0.000006 commit_to_os()                 Calling ped_disk_commit_to_os() ...
  4.462572 +1.022255 commit_to_os()                 Returned from ped_disk_commit_to_os()
  4.462763 +0.000190 execute_command()              execute: sh -c udevadm settle --timeout=1
  5.466235 +1.003472 execute_command()              exit status 0
  5.466299 +0.000064 commit_to_os()                 return true
  5.466514 +0.000215 set_devices()                  end

From my limited understanding of the libparted API, ped_disk_commit_to_os() is only needed after the partitions are changed and not otherwise.
    int ped_disk_commit_to_os ( PedDisk * disk )

4

Re: GParted device refresh speed

mfleetwo wrote:

From my limited understanding of the libparted API, ped_disk_commit_to_os() is only needed after the partitions are changed and not otherwise.
    int ped_disk_commit_to_os ( PedDisk * disk )

Your understanding is the same as mine.

Recently I noticed that deleting (clearing) a FAT16/32 volume label is not reflected in the blkid cache.  See Bug 684403 - GParted appends 'nA' to label when changing label.

I wonder if there is a way to indicate that items such as volume labels or UUIDs have changed when there has been no corresponding partition boundary change?

5 (edited by mfleetwo 2012-10-04 14:05:16)

Re: GParted device refresh speed

gedakc wrote:

Recently I noticed that deleting (clearing) a FAT16/32 volume label is not reflected in the blkid cache.  See Bug 684403 - GParted appends 'nA' to label when changing label.

I wonder if there is a way to indicate that items such as volume labels or UUIDs have changed when there has been no corresponding partition boundary change?

(I think this is what you are asking ...)

The blkid command primarily reads from its cache.  To force a scan of the hard drive use "blkid -c /dev/null", although this can't update the cache file.  Relabelling ext2/3/4 using "tune2fs -L ..." also updates blkid's cache.  Relabelling fat16/32 using "mlabel ..." doesn't update blkid's cache, so the blkid command doesn't (always) reflect label changes.  I determined this by using strace rather than by reading any code.

I think that GParted should just be changed to use "blkid -c /dev/null" in FS_Info::load_fs_info_cache() to read the UUIDs and labels.  It will add an extra second or so to every refresh.

6 (edited by gedakc 2012-10-04 18:41:05)

Re: GParted device refresh speed

mfleetwo wrote:
gedakc wrote:

I wonder if there is a way to indicate that items such as volume labels or UUIDs have changed when there has been no corresponding partition boundary change?

(I think this is what you are asking ...)

The blkid command primarily reads from its cache.  To force a scan of the hard drive use "blkid -c /dev/null", although this can't update the cache file.  Relabelling ext2/3/4 using "tune2fs -L ..." also updates blkid's cache.  Relabelling fat16/32 using "mlabel ..." doesn't update blkid's cache, so the blkid command doesn't (always) reflect label changes.  I determined this by using strace rather than by reading any code.

Thank you for the explanation.  Perhaps there is a command or ioctl that could be called in GParted to indicate that the volume label or UUID changed in FAT16/32 file system?

mfleetwo wrote:

I think that GParted should just be changed to use "blkid -c /dev/null" in FS_Info::load_fs_info_cache() to read the UUIDs and labels.  It will add an extra second or so to every refresh.

GParted previously used "blkid -c /dev/null".  Unfortunately blkid appears to have problems when the BIOS is misconfigured to indicate a floppy device is present when there is no physical floppy drive installed.  The problem is that blkid hangs for several minutes.  To work around this we changed blkid to use cached results.

The relevant bug report is:
Bug 667511 - Gparted does not start and continues to scan devices

It looks like this change has the undesirable effect of not reflecting volume label and UUID updates to FAT16/32 file systems.  ;-(

EDIT:
This problem does not appear to be fully resolved yet.
See Ubuntu Launchpad Bug 1029149 - gparted automatically starts scanning all devices slowly and never completing the task

Perhaps we need to consider using "blkid -p /path-to-partition" for each partition?
My concern would be that this would add several more seconds to the device refresh time.

EDIT2:
The original intention of using blkid was to be able to capture volume label and UUID information for file systems that do not have a command line equivalent for retrieving this information.  To speed things up, GParted used to rely on the blkid information only, provided that it was not empty.  If it was empty, GParted would try the file system specific command.  This did speed refresh up considerably.  The reason blkid was changed to not be the default was due to UTF-8 language encoding in volume labels.  See Bug 662537 - Ext4 unicode labels not shown correctly.

7 (edited by mfleetwo 2012-10-06 16:40:30)

Re: GParted device refresh speed

Blank label handling in both GParted and blkid is broken.  Use GParted to create any file system and set the label to something.  All good.  Then try clearing the label.  After refresh the previous label is displayed.  Wrong.  Use blkid to display the label, the previous label is displayed.  Wrong.  Use the FS specific command, blank label is displayed.  Correct.  True for every file system I tried: btrfs, ext2/3/4, fat16/32, jfs, nilfs2, reiserfs and xfs.

GParted reads the label using the file system specific method and correctly sets the label to a zero length string.  Then GParted_Core::set_device_partitions() decides that the label hasn't been set and calls FS_Info::get_label() to read the label from the cache of blkid output.  It gets the previous value. 

For a file system label GParted needs to distinguish between the zero length label and the unknown or unset label.
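
One possible way to represent that distinction is sketched below, using a separate "known" flag next to the label text.  FsLabel and effective_label() are hypothetical names, and this is only one of several representations that would work.

```cpp
#include <string>

// Hypothetical label holder: "known" is tracked separately from the text,
// so an empty string can mean "blank" rather than "not read yet".
struct FsLabel
{
    bool        known = false;
    std::string text;
};

// Prefer the file system specific result whenever it is known, even if blank.
// Fall back to the (possibly stale) blkid value only when nothing was read.
std::string effective_label(const FsLabel & from_fs_tool, const std::string & from_blkid)
{
    return from_fs_tool.known ? from_fs_tool.text : from_blkid;
}
```

With this shape a cleared label stays cleared: a known, zero length label wins over whatever the blkid cache still reports.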

8

Re: GParted device refresh speed

mfleetwo wrote:

Blank label handling in both GParted and blkid is broken.
<stuff deleted>
For a file system label GParted needs to distinguish between the zero length label and the unknown or unset label.

Good catch Mike.  If blkid did work correctly, then I think GParted should also work correctly.  Having said that, GParted does contain code that works around other tool deficiencies.

I noticed that you reported the blkid blank label handling problem on the util-linux mailing list.
bug: blkid doesn't notice when fs labels change to blank

9 (edited by mfleetwo 2012-10-06 19:19:23)

Re: GParted device refresh speed

I'll code a fix for GParted.  Thinking of changing partition.label from a string to a pointer to a string so NULL can be used to mean unknown and partition->label = "" is a blank label.  I don't think that you should hold the release of GParted 0.14 for this fix because the bug doesn't cause any harm and has existed for a long time so appears to be rarely encountered by users.

10

Re: GParted device refresh speed

Thanks Mike for looking into this issue.  And I agree that we should not hold up the 0.14.0 release since this bug is minor.

11

Re: GParted device refresh speed

Raised:
Bug 685656 - GParted doesn't notice when file system label is changed to blank

12

Re: GParted device refresh speed

Thanks for raising the bug report Mike.   Though this is a minor issue in that no data is lost, this issue is definitely one that would be nice to have fixed.

13

Re: GParted device refresh speed

It appears that newer versions of blkid might support UTF-8 characters in the label without encoding these in the output.

From the blkid man page in Ubuntu 12.04:

-d    Don't encode non-printing characters.  The non-printing characters
      are encoded by ^ and M- notation by default.  Note that -o udev
      output format uses a different encoding and this encoding cannot
      be disabled.

If we could use blkid to scan for all labels and UUIDs, then that would save time needed for individual tool scans for this information.  It's just a thought at this time.  I haven't looked at your patch yet.

14

Re: GParted device refresh speed

Unfortunately while blkid doesn't recognise a label being changed to blank, using blkid first to read the label will give GParted the same issue, undoing anything fixed by Bug 685656 - GParted doesn't notice when file system label is changed to blank.

At the moment I'm slowly going through and understanding how GParted loads all the device, partition and file system information and why it does it the way it does so I can decide how to change it.

15

Re: GParted device refresh speed

Another thought is to try to work around the blkid limitation by somehow forcing blkid to update its cache for the partition when the partition label is cleared.  So far I haven't discovered a blkid parameter that would do this though.

16

Re: GParted device refresh speed

Back to refresh speed ...

Finally worked out what the purpose of the code is that uses all the time identified from the timings in comment #3 above.  The calling code is:

src/GParted_Core.cc - void GParted_Core::set_devices( std::vector<Device> & devices )
   297                                  if ( temp_device .highest_busy )
   298                                  {
   299                                          temp_device .readonly = ! commit_to_os( lp_disk, 1 ) ;
   300                                          //Clear libparted messages.  Typically these are:
   301                                          //  The kernel was unable to re-read the partition table...
   302                                          libparted_messages .clear() ;
   303                                  }

The call chain looks like this:

commit_to_os(timeout=1)
    ped_disk_commit_to_os()    // takes ~ 1 second
    settle_device(timeout=1)
        Utils::execute_command("udevadm settle --timeout=1")    // takes up to 1 second first time and typically milliseconds thereafter

During every refresh cycle the above code is executed for every disk.  It is performed to determine if the kernel can be informed of on-disk partition changes when there are other partitions on the disk which are active; especially given the comment on line 301.  Example error from fdisk when the kernel can't be informed:

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

If GParted determines that the kernel can't be informed of partition changes it disables some operations which it considers dangerous.  Those operations appear to be: resize/move partition, copy paste partition into empty space and creating a new partition other than unformatted (that last one seems a bit strange).

Originally the Linux kernel could only re-read the whole of the partition table using the BLKRRPART ioctl.  (That's all fdisk and sfdisk still issue now).  This fails with EBUSY (Device or resource busy) when any partition is active.  Newer kernels and tools can also use BLKPG_ADD_PARTITION and BLKPG_DEL_PARTITION ioctls to inform the kernel of changes to non-active partitions.  (Guess this includes at least Linux >= 2.6 and (lib)parted >= 2.0).

Useful answer in parted-devel email thread of yours
    Warning message if create partition when a partition is mounted

I don't see any reason why we can't get rid of this per device readonly flag and have a single program wide setting which is only ever tested once when scanning the first device for the first time.

(This is the first of several ideas I have for how to address refresh speed).

EDIT: Unfortunately it's not that simple from a program point of view.  Active partitions are under the control of the users.  There may be no active partitions when GParted is started.  The user may then make a partition active and because they are using older software it may not be possible to inform the kernel of the partition changes.  Need to check out exactly which versions of the kernel and libparted are affected and decide if we need to support them or the code needs to be more subtle.

17

Re: GParted device refresh speed

The call to ped_disk_commit_to_os() is important as I recall.

Back in 2009 many of our users experienced problems resizing partitions when GParted failed to inform the kernel of partition changes.  Unfortunately the failure to inform occurred after changing the partition boundaries, but not before the file system was maximized to fill what the kernel thought were the partition boundaries.
See WARNING! Problem Resizing File Systems with GParted

Since this time the GParted code has been improved to better deal with this situation, and also the libparted code has improved.


There is probably room for improvement here, but I do recall many problems when various versions of libparted were mixed with different kernel versions.

18

Re: GParted device refresh speed

I looked through all the links and there seemed to have been these two separate bugs with GParted / (lib)parted failing to inform the kernel of partition changes after they were applied on the disk.  Both were fixed by applying patches to (lib)parted and incorporating those into GParted live image.  Were there any fixes applied to GParted app?  Can you point them out to me?

Bug reports:

  1. Bug 601574 - ERROR: Current NTFS volume size is bigger than the device size!

  2. Bug 604298 - Problems resizing file systems with gparted-live-0.5.0-3

Commits (parted upstream URLs):

  1. parted: avoid unnecessary open/close on commit, and thus udev activity

  2. linux: add wait time and retries to kernel partition reread

When GParted applies each partition add, partition remove or partition resize operation it calls commit_to_os() to inform the kernel of the change, which I will not be changing.  I don't think the speed of applying this is a problem.

However during refresh of the display after all the operations have been applied it calls commit_to_os() again in set_devices() and takes 1 to 2 seconds for each disk which has at least one active partition to discover if GParted will be able to inform the kernel of partition changes in the future.  If not GParted disables some functionality: resize/move partition, copy paste partition into empty space and creating a new partition other than unformatted.  This is what I want to address.

Assuming these extra calls to commit_to_os() during the refresh don't make it work, they are independent and can be reworked.

19

Re: GParted device refresh speed

The patches in libparted helped to alleviate the problem, but did not absolutely prevent the problem from happening.

mfleetwo wrote:

Were there any fixes applied to GParted app?  Can you point them out to me?

In GParted we made the following code changes in an effort to reduce the chance of the problem "ERROR: Current NTFS volume size is bigger than the device size!" occurring:

GParted Commits:
Add check if partition table re-read work around code is needed
Avoid redundant file system maximize actions (#663980)

mfleetwo wrote:

Assuming these extra calls to commit_to_os() during the refresh don't make it work, they are independent and can be reworked.

You make a good point Mike.  These extra calls to commit_to_os() during the refresh did not prevent the problem that occurred.  I can only guess that these calls might detect if the disk is read-only.  Otherwise what purpose are the calls serving?