There seem to be two bugs when extracting libraries from the shared cache, both regarding segment/section rebasing:
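- Sections in the __TEXT segment seem to be skipped when rebasing: the segment itself is rebased to file offset 0x0, but its sections keep their in-cache file offsets (e.g. 0x08795a18 for __TEXT.__text).
- Rebasing appears to break sections that aren't mapped from the file (usually zero fills): __DATA.__common and __DATA.__bss end up with bogus file offsets such as 0xe43d1000.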
As an example, here is the WebKit framework from an iPhone6,2's 9.3.1 arm64 cache.
Extracted with jtool:
LC 00: LC_SEGMENT_64 Mem: 0x188794000-0x188a7c000 File: 0x0-0x2e8000 r-x/r-x __TEXT
Mem: 0x188795a18-0x188a0cec8 File: 0x08795a18-0x08a0cec8 __TEXT.__text (Normal)
Mem: 0x188a0cec8-0x188a12cdc File: 0x08a0cec8-0x08a12cdc __TEXT.__stubs (Symbol Stubs)
Mem: 0x188a12cdc-0x188a18ad8 File: 0x08a12cdc-0x08a18ad8 __TEXT.__stub_helper (Normal)
Mem: 0x188a18ad8-0x188a2b30c File: 0x08a18ad8-0x08a2b30c __TEXT.__gcc_except_tab
Mem: 0x188a2b30c-0x188a3bc29 File: 0x08a2b30c-0x08a3bc29 __TEXT.__objc_methname (C-String Literals)
Mem: 0x188a3bc29-0x188a3cb40 File: 0x08a3bc29-0x08a3cb40 __TEXT.__objc_classname (C-String Literals)
Mem: 0x188a3cb40-0x188a629fb File: 0x08a3cb40-0x08a629fb __TEXT.__objc_methtype (C-String Literals)
Mem: 0x188a629fb-0x188a75a3e File: 0x08a629fb-0x08a75a3e __TEXT.__cstring (C-String Literals)
Mem: 0x188a75a40-0x188a75f24 File: 0x08a75a40-0x08a75f24 __TEXT.__const
Mem: 0x188a75f24-0x188a75f77 File: 0x08a75f24-0x08a75f77 __TEXT.__os_activity
Mem: 0x188a75f77-0x188a76380 File: 0x08a75f77-0x08a76380 __TEXT.__dof_WebKitMes (DTrace DOFs)
Mem: 0x188a76380-0x188a7c000 File: 0x08a76380-0x08a7c000 __TEXT.__unwind_info
LC 01: LC_SEGMENT_64 Mem: 0x19df17000-0x19df22d40 File: 0x2e8000-0x2f3d40 rw-/rw- __DATA
Mem: 0x19df17000-0x19df17020 File: 0x002e8000-0x002e8020 __DATA.__la_weak_ptr (Lazy Symbol Ptrs)
Mem: 0x19df17020-0x19df1a178 File: 0x002e8020-0x002eb178 __DATA.__objc_selrefs (Literal Pointers)
Mem: 0x19df1a178-0x19df1a190 File: 0x002eb178-0x002eb190 __DATA.__objc_protorefs
Mem: 0x19df1a190-0x19df1a910 File: 0x002eb190-0x002eb910 __DATA.__objc_classrefs (Normal)
Mem: 0x19df1a910-0x19df1ac58 File: 0x002eb910-0x002ebc58 __DATA.__objc_superrefs (Normal)
Mem: 0x19df1ac58-0x19df1b2c8 File: 0x002ebc58-0x002ec2c8 __DATA.__objc_ivar
Mem: 0x19df1b2c8-0x19df1d9d8 File: 0x002ec2c8-0x002ee9d8 __DATA.__objc_data
Mem: 0x19df1d9d8-0x19df210b0 File: 0x002ee9d8-0x002f20b0 __DATA.__data
Mem: 0x19df210b0-0x19df210c0 File: 0xe43d1000-0xe43d1010 __DATA.__common (Zero Fill)
Mem: 0x19df210c0-0x19df22d40 File: 0xe43d1000-0xe43d2c80 __DATA.__bss (Zero Fill)
LC 02: LC_SEGMENT_64 Mem: 0x19ea24408-0x19ea244a8 File: 0x2f3d40-0x2f3de0 rw-/rw- __DATA_DIRTY
Mem: 0x19ea24408-0x19ea244a8 File: 0x002f3d40-0x002f3de0 __DATA_DIRTY.__objc_data
LC 03: LC_SEGMENT_64 Mem: 0x19b11a000-0x19b149048 File: 0x2f3de0-0x322e28 rw-/rw- __DATA_CONST
Mem: 0x19b11a000-0x19b11a738 File: 0x002f3de0-0x002f4518 __DATA_CONST.__got (Non-Lazy Symbol Ptrs)
Mem: 0x19b11a738-0x19b11e5d0 File: 0x002f4518-0x002f83b0 __DATA_CONST.__la_symbol_ptr (Lazy Symbol Ptrs)
Mem: 0x19b11e5d0-0x19b12cf60 File: 0x002f83b0-0x00306d40 __DATA_CONST.__const
Mem: 0x19b12cf60-0x19b12f3e0 File: 0x00306d40-0x003091c0 __DATA_CONST.__cfstring
Mem: 0x19b12f3e0-0x19b12f7d8 File: 0x003091c0-0x003095b8 __DATA_CONST.__objc_classlist
Mem: 0x19b12f7d8-0x19b12f8c8 File: 0x003095b8-0x003096a8 __DATA_CONST.__objc_catlist (Normal)
Mem: 0x19b12f8c8-0x19b12fa30 File: 0x003096a8-0x00309810 __DATA_CONST.__objc_protolist
Mem: 0x19b12fa30-0x19b12fa38 File: 0x00309810-0x00309818 __DATA_CONST.__objc_imageinfo
Mem: 0x19b12fa38-0x19b149048 File: 0x00309818-0x00322e28 __DATA_CONST.__objc_const
LC 04: LC_SEGMENT_64 Mem: 0x1a0e12000-0x1a44a4000 File: 0x322e28-0x39b4e28 r--/r-- __LINKEDIT
...
Versus dd'ed out of the shared cache:
LC 00: LC_SEGMENT_64 Mem: 0x188794000-0x188a7c000 File: 0x0-0x2e8000 r-x/r-x __TEXT
Mem: 0x188795a18-0x188a0cec8 File: 0x08795a18-0x08a0cec8 __TEXT.__text (Normal)
Mem: 0x188a0cec8-0x188a12cdc File: 0x08a0cec8-0x08a12cdc __TEXT.__stubs (Symbol Stubs)
Mem: 0x188a12cdc-0x188a18ad8 File: 0x08a12cdc-0x08a18ad8 __TEXT.__stub_helper (Normal)
Mem: 0x188a18ad8-0x188a2b30c File: 0x08a18ad8-0x08a2b30c __TEXT.__gcc_except_tab
Mem: 0x188a2b30c-0x188a3bc29 File: 0x08a2b30c-0x08a3bc29 __TEXT.__objc_methname (C-String Literals)
Mem: 0x188a3bc29-0x188a3cb40 File: 0x08a3bc29-0x08a3cb40 __TEXT.__objc_classname (C-String Literals)
Mem: 0x188a3cb40-0x188a629fb File: 0x08a3cb40-0x08a629fb __TEXT.__objc_methtype (C-String Literals)
Mem: 0x188a629fb-0x188a75a3e File: 0x08a629fb-0x08a75a3e __TEXT.__cstring (C-String Literals)
Mem: 0x188a75a40-0x188a75f24 File: 0x08a75a40-0x08a75f24 __TEXT.__const
Mem: 0x188a75f24-0x188a75f77 File: 0x08a75f24-0x08a75f77 __TEXT.__os_activity
Mem: 0x188a75f77-0x188a76380 File: 0x08a75f77-0x08a76380 __TEXT.__dof_WebKitMes (DTrace DOFs)
Mem: 0x188a76380-0x188a7c000 File: 0x08a76380-0x08a7c000 __TEXT.__unwind_info
LC 01: LC_SEGMENT_64 Mem: 0x19df17000-0x19df22d40 File: 0x1bf17000-0x1bf22d40 rw-/rw- __DATA
Mem: 0x19df17000-0x19df17020 File: 0x1bf17000-0x1bf17020 __DATA.__la_weak_ptr (Lazy Symbol Ptrs)
Mem: 0x19df17020-0x19df1a178 File: 0x1bf17020-0x1bf1a178 __DATA.__objc_selrefs (Literal Pointers)
Mem: 0x19df1a178-0x19df1a190 File: 0x1bf1a178-0x1bf1a190 __DATA.__objc_protorefs
Mem: 0x19df1a190-0x19df1a910 File: 0x1bf1a190-0x1bf1a910 __DATA.__objc_classrefs (Normal)
Mem: 0x19df1a910-0x19df1ac58 File: 0x1bf1a910-0x1bf1ac58 __DATA.__objc_superrefs (Normal)
Mem: 0x19df1ac58-0x19df1b2c8 File: 0x1bf1ac58-0x1bf1b2c8 __DATA.__objc_ivar
Mem: 0x19df1b2c8-0x19df1d9d8 File: 0x1bf1b2c8-0x1bf1d9d8 __DATA.__objc_data
Mem: 0x19df1d9d8-0x19df210b0 File: 0x1bf1d9d8-0x1bf210b0 __DATA.__data
Mem: 0x19df210b0-0x19df210c0 Not mapped to file __DATA.__common (Zero Fill)
Mem: 0x19df210c0-0x19df22d40 Not mapped to file __DATA.__bss (Zero Fill)
LC 02: LC_SEGMENT_64 Mem: 0x19ea24408-0x19ea244a8 File: 0x1ca24408-0x1ca244a8 rw-/rw- __DATA_DIRTY
Mem: 0x19ea24408-0x19ea244a8 File: 0x1ca24408-0x1ca244a8 __DATA_DIRTY.__objc_data
LC 03: LC_SEGMENT_64 Mem: 0x19b11a000-0x19b149048 File: 0x1911a000-0x19149048 rw-/rw- __DATA_CONST
Mem: 0x19b11a000-0x19b11a738 File: 0x1911a000-0x1911a738 __DATA_CONST.__got (Non-Lazy Symbol Ptrs)
Mem: 0x19b11a738-0x19b11e5d0 File: 0x1911a738-0x1911e5d0 __DATA_CONST.__la_symbol_ptr (Lazy Symbol Ptrs)
Mem: 0x19b11e5d0-0x19b12cf60 File: 0x1911e5d0-0x1912cf60 __DATA_CONST.__const
Mem: 0x19b12cf60-0x19b12f3e0 File: 0x1912cf60-0x1912f3e0 __DATA_CONST.__cfstring
Mem: 0x19b12f3e0-0x19b12f7d8 File: 0x1912f3e0-0x1912f7d8 __DATA_CONST.__objc_classlist
Mem: 0x19b12f7d8-0x19b12f8c8 File: 0x1912f7d8-0x1912f8c8 __DATA_CONST.__objc_catlist (Normal)
Mem: 0x19b12f8c8-0x19b12fa30 File: 0x1912f8c8-0x1912fa30 __DATA_CONST.__objc_protolist
Mem: 0x19b12fa30-0x19b12fa38 File: 0x1912fa30-0x1912fa38 __DATA_CONST.__objc_imageinfo
Mem: 0x19b12fa38-0x19b149048 File: 0x1912fa38-0x19149048 __DATA_CONST.__objc_const
LC 04: LC_SEGMENT_64 Mem: 0x1a0e12000-0x1a44a4000 File: 0x1ce12000-0x204a4000 r--/r-- __LINKEDIT
...
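To make the expected behaviour concrete, here is a minimal sketch of how I would expect the section file offsets to be recomputed. It is not jtool's actual code: the `Section` class and `rebase_sections` function are hypothetical names, and it assumes each section sits at the same distance from its segment's start in the file as in memory (which holds for the __DATA sections in both dumps above). Zero-fill sections are given offset 0, which is the usual convention.

```python
from dataclasses import dataclass

# Constants from <mach-o/loader.h>
SECTION_TYPE_MASK = 0x000000FF
S_ZEROFILL = 0x1


@dataclass
class Section:  # hypothetical container, for illustration only
    name: str
    addr: int      # virtual address
    offset: int    # file offset
    flags: int = 0


def rebase_sections(seg_vmaddr, new_seg_fileoff, sections):
    """Return copies of `sections` with file offsets relative to the
    extracted file instead of the shared cache."""
    rebased = []
    for s in sections:
        if (s.flags & SECTION_TYPE_MASK) == S_ZEROFILL:
            # Not backed by file data, so there is nothing to rebase.
            new_offset = 0
        else:
            # Assumption: same distance from segment start in file and memory.
            new_offset = new_seg_fileoff + (s.addr - seg_vmaddr)
        rebased.append(Section(s.name, s.addr, new_offset, s.flags))
    return rebased


# Values taken from the dumps above.
text = rebase_sections(0x188794000, 0x0, [
    Section("__text", 0x188795a18, 0x08795a18),
])
data = rebase_sections(0x19df17000, 0x2e8000, [
    Section("__objc_selrefs", 0x19df17020, 0x1bf17020),
    Section("__bss", 0x19df210c0, 0, S_ZEROFILL),
])

for s in text + data:
    print(f"{s.name}: file offset {s.offset:#x}")
```

With this rule, __DATA.__objc_selrefs lands at 0x2e8020, which matches what jtool already produces for __DATA, and __TEXT.__text would land at 0x1a18 rather than staying at 0x08795a18.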
Apart from that, the __LINKEDIT segment of extracted libs still takes up a lot of space (due to merging, as you explained in one of your articles).
I assume each dylib from the shared cache is only affected by a small percentage of the commands in __LINKEDIT, right? If so, would it be possible to strip those commands that don't affect it?
It also seems that the __LINKEDIT segment is badly fragmented (jtool --pages WebKit):
...
0x322e28-0x39b4e28 __LINKEDIT
0x3241a0-0x3241f8 Weak Bind Info (opcodes)
0x62e0a8-0x637fe0 Exports
0xcca840-0xcd0bd8 Binding Info (opcodes)
0x1030f70-0x104d008 Lazy Bind Info (opcodes)
0x1aebca0-0x1b424b0 Symbol Table
0x2b8ae98-0x2b913c8 Function Starts
0x2cabbf0-0x2cabbf0 Data In Code
0x2d0de54-0x2d120a8 Indirect Symbol Table
0x2ddaf18-0x398b1db String Table
...
Am I missing something, or are there really about 17 MB (0x2b8ae98 - 0x1b424b0 = 0x10489e8 bytes) between the symbol table and function starts that are just unused?
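For reference, this is roughly how I measured the holes. It is only a quick sketch over the ranges printed by jtool --pages above; the `find_gaps` helper is my own, not part of jtool.

```python
# (name, start, end) ranges copied from the jtool --pages output above.
regions = [
    ("Weak Bind Info", 0x3241a0, 0x3241f8),
    ("Exports", 0x62e0a8, 0x637fe0),
    ("Binding Info", 0xcca840, 0xcd0bd8),
    ("Lazy Bind Info", 0x1030f70, 0x104d008),
    ("Symbol Table", 0x1aebca0, 0x1b424b0),
    ("Function Starts", 0x2b8ae98, 0x2b913c8),
    ("Data In Code", 0x2cabbf0, 0x2cabbf0),
    ("Indirect Symbol Table", 0x2d0de54, 0x2d120a8),
    ("String Table", 0x2ddaf18, 0x398b1db),
]


def find_gaps(regions):
    """Return (previous, next, size) for every hole between consecutive regions."""
    ordered = sorted(regions, key=lambda r: (r[1], r[2]))
    gaps = []
    for (prev_name, _, prev_end), (name, start, _) in zip(ordered, ordered[1:]):
        if start > prev_end:
            gaps.append((prev_name, name, start - prev_end))
    return gaps


gaps = find_gaps(regions)
for prev_name, name, size in gaps:
    print(f"{prev_name} -> {name}: {size:#x} bytes ({size / 1e6:.1f} MB)")
```

The Symbol Table -> Function Starts entry comes out as 0x10489e8 bytes, which is where the 17 MB figure comes from.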
And finally, something very minor: the --pages option can list entries in the wrong order when zero-sized entries are involved (see the two lines marked with "# <--"):
...
0xc000-0xee60 __LINKEDIT
0xc000-0xc008 Rebase Info (opcodes)
0xc008-0xc040 Binding Info (opcodes)
0xc040-0xc130 Lazy Bind Info (opcodes)
0xc130-0xc198 Exports
0xc198-0xc1a8 Function Starts
0xc1a8-0xc2f8 Symbol Table # <--
0xc1a8-0xc1a8 Data In Code # <--
0xc2f8-0xc370 Indirect Symbol Table
0xc370-0xc484 String Table
0xc490-0xee60 Code signature
(Taken from the latest kdump, if you need something to test it on.)
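I don't know how jtool orders these internally, but as a sketch of the ordering I would expect: sorting on (start, end) instead of start alone puts a zero-sized entry before a non-empty one that begins at the same offset. The `page_order` function is a hypothetical name for illustration.

```python
# (name, start, end) entries in the order jtool --pages printed them above.
entries = [
    ("Rebase Info", 0xc000, 0xc008),
    ("Binding Info", 0xc008, 0xc040),
    ("Lazy Bind Info", 0xc040, 0xc130),
    ("Exports", 0xc130, 0xc198),
    ("Function Starts", 0xc198, 0xc1a8),
    ("Symbol Table", 0xc1a8, 0xc2f8),
    ("Data In Code", 0xc1a8, 0xc1a8),
    ("Indirect Symbol Table", 0xc2f8, 0xc370),
    ("String Table", 0xc370, 0xc484),
    ("Code signature", 0xc490, 0xee60),
]


def page_order(entries):
    """Sort by start offset, breaking ties by end offset so that
    zero-sized entries come first."""
    return sorted(entries, key=lambda e: (e[1], e[2]))


ordered_names = [name for name, _, _ in page_order(entries)]
for name, start, end in page_order(entries):
    print(f"{start:#x}-{end:#x} {name}")
```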