Saturday, February 18, 2012

2/13-2/17 CS373 Blog Post

Good evening everyone!

It's been a busy week!  The PFD project was due and we had our first exam in the course.  A little stressful, but I think we all survived!

This week, the primary focus was on the Project File Dependencies (PFD) project.  The basic concept of this project was to take in an input (essentially a graph of nodes, each with a list of dependencies) and output an ordering of the node numbers such that no number depends on a number that comes after it.  This is definitely a project that admits several different implementations.  In collaboration with my partner, we ended up choosing the easy route: a list that, through iterations, keeps itself sorted (with as few calls to sort as possible).  Our program started off by reading the first line of input, which told us how many more lines to read: the first number designated how many vertices (or nodes, if you prefer that) are in our graph, while the second represented how many dependency rules would follow.  We initialized a list filled with empty lists, one empty list per vertex in the graph.  We then iterated through the rest of the input to collect the rules.  Each line contained the node number, followed by the number of dependencies it has, followed by the dependency node numbers themselves.  Constructing the graph amounted to storing that data at the graph index corresponding to the rule's node.  So, for instance, a rule of "5 3 1 2 3" would create the list [1, 2, 3] and place it at graph[4].
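Here's a minimal sketch of that reading and construction step, assuming the input arrives on standard input (the function name read_graph is just mine for illustration):

import sys

def read_graph(reader=sys.stdin):
    # First line: number of vertices, then number of dependency rules.
    num_vertices, num_rules = [int(x) for x in reader.readline().split()]

    # One (possibly empty) dependency list per vertex, indexed by node - 1.
    graph = [[] for _ in range(num_vertices)]

    # Each rule line: node, count of dependencies, then the dependencies themselves.
    for _ in range(num_rules):
        tokens = [int(x) for x in reader.readline().split()]
        node, deps = tokens[0], tokens[2:]
        graph[node - 1] = deps          # e.g. "5 3 1 2 3" -> graph[4] = [1, 2, 3]

    return graph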

Once we filled up our graph, we went down the list and collected the indices whose dependency lists were empty.  These indices were appended to an intermediate container (in our case, a sorted list).  Since we walk down the graph in order every time, that list starts out sorted.  For every index we found to be empty, we replaced the empty list with a -1 delimiter to signify that it had been "removed" from our graph.  Then, for each node in our intermediate list, we searched through all non-empty dependency lists and removed that node from them, again to signify that it had been removed.  After each such pass, we added the node to a result list.  We then searched again for any newly empty lists; if there were any, we added them to the intermediate container and sorted.  Rinse, repeat, profit.
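Roughly, the removal loop looked something like this (a sketch from memory, not our exact submission):

def topo_order(graph):
    result = []
    # Start with every node whose dependency list is already empty.
    ready = sorted(i + 1 for i, deps in enumerate(graph) if not deps)
    for node in ready:
        graph[node - 1] = -1            # mark the node as "removed"

    while ready:
        node = ready.pop(0)             # smallest free node first
        result.append(node)
        newly_free = []
        # Erase this node from every remaining dependency list.
        for i, deps in enumerate(graph):
            if deps != -1 and node in deps:
                deps.remove(node)
                if not deps:            # node i + 1 just became free
                    graph[i] = -1
                    newly_free.append(i + 1)
        # Add the newly freed nodes and keep the pending list sorted.
        ready.extend(newly_free)
        ready.sort()
    return result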

Looking back, I realize we could have done a better job optimizing our code.  Every time we add new nodes to our intermediate list, we sort, which can take a while depending on how big the graph is and how many nodes become free after one removal.  I did learn through research, though, that Python's sort() is Timsort, an adaptive merge sort that runs a lot faster than a traditional merge sort on typical data.  Regardless, had we used a different kind of container like a linked list or a binary heap, we could have been more efficient with more optimal code.  Maybe I'll experiment with that later.
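For example, Python's heapq module turns a plain list into a binary heap, which would avoid the repeated sort calls; this is just a sketch of the experiment I have in mind, not what we actually submitted:

import heapq

ready = [7, 2, 5]                 # hypothetical free nodes waiting to be output
heapq.heapify(ready)              # build the heap once, O(n)
heapq.heappush(ready, 3)          # adding a newly freed node is O(log n), no re-sort
smallest = heapq.heappop(ready)   # always pops the lowest-numbered node (2 here)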

In class, we discussed iteration, parallel assignment, keyword arguments, default arguments, unpacking iterable items, and creating tuple and dictionary containers in functions (phew, so many topics!).  I guess it never really hit me, but it makes sense that within a for-each loop the loop variable holds its own copy of the reference to each item, so reassigning it doesn't touch the container.  It abides by the iterability (not really a word, but bear with me) standard.  Most of our conversation this week revolved around some interesting quirks of Python.  Parallel assignment was interesting in the sense that you can assign from any iterable object, as long as its length matches the number of targets, and it just works.  I guess that's one amazing but potentially dangerous thing about Python; you can pass in just about anything, as long as it makes sense.  A list, a tuple, a set, a dictionary, or an xrange could all be passed into the example from the slide with the same result.  Pretty ridiculous.  Passing arguments as keywords was also a really cool thing to learn about; as Professor Downing mentioned, this is incredibly useful for spelling out exactly what your variables are in method calls.  Someone looking at fresh code could see a function call like this:

custom_hash(24, 189, data, "x381jfs")

and would probably have no idea what it meant, other than that it uses some kind of data to create some hash.  If, instead, it were rewritten like so:

custom_hash(data_begin_index = 24, data_end_index = 189, list_of_data = data, salt = "x381jfs")

one could then decipher that this is a hash function that takes a list, is only interested in a range of that list, and then salts it.  Of course, it is very fragile; it breaks down easily if someone decides the parameter names are too ambiguous, too long, unnecessary, etc.  Still, it's a useful technique.  You can also place the keyword assignments in any order and still get the right answer, so long as any arguments that aren't named explicitly are supplied positionally in the right places.  Furthermore, you can provide default values for arguments that aren't passed.  Again, pretty useful if a function is called several times with the same fixed value.
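To make that concrete, here's a toy definition of custom_hash (the body is entirely made up; only the names come from the example above) showing keyword and default arguments together:

def custom_hash(list_of_data, data_begin_index=0, data_end_index=None, salt=""):
    # Toy stand-in: "hash" a slice of the list, mixed with a salt.
    chunk = list_of_data[data_begin_index:data_end_index]
    return hash(tuple(chunk) + (salt,))

data = [3, 1, 4, 1, 5, 9, 2, 6]

# Keyword arguments can appear in any order.
custom_hash(salt="x381jfs", data_end_index=5, data_begin_index=2, list_of_data=data)

# Defaults fill in anything left out (the begin index, end index, and salt here).
custom_hash(data)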

We also discussed unpacking items, as well as creating tuples and dictionaries straight from the parameter line.  Unpacking items is kind of cool.  You can pass in just about any iterable (like the parallel assignment scheme discussed above) of the right length, based on the parameters, and it will unpack the data inside into whatever names you desire.  That is, it strips the data out of a container so it can be used for other purposes, which removes the need to iterate through the data and add it to some other container by hand.  Alongside unpacking, we can do the opposite and collect arguments into tuples and dictionaries.  Again, useful features that eliminate those extra lines of code, because programmers can be awfully lazy sometimes.
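A small, made-up illustration of both directions: unpacking an iterable into separate names (and into a call's arguments), and collecting arguments back into a tuple and a dictionary with * and **:

def describe(*args, **kwargs):
    # Extra positional arguments land in the tuple args,
    # extra keyword arguments land in the dictionary kwargs.
    return args, kwargs

point = [3, 5]
x, y = point                          # parallel assignment: unpack a two-item iterable
print(describe(*point, color="red"))  # * unpacks the list into positional arguments
# prints ((3, 5), {'color': 'red'})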

Lastly, we had a test last Friday.  It didn't turn out as bad as I had expected!  As Professor Downing had said, he geared the test toward making the cheat sheet virtually useless, but writing one still served as a very helpful review of all the material we've talked about thus far.  I stumbled a bit on the Haskell problem (I hadn't really looked into it that much; I had bet it would be a multiple-choice question!) but I had a rough idea of the syntax.  Hopefully not many points will be deducted on that problem.  All I know is that all of us are anticipating what the final scores will be!

Until next time,

Corey
