Saturday, March 24, 2012

3/19 - 3/23 CS373 Blog Post

Good evening all!

What a busy week it has been! Finishing the project and then preparing for an exam right after was not a very fun turnaround.  But the week is over, and that might be the only good thing to have come out of it!

This past week, my team and I worked our butts off on project 4: World Crises, Phase 1.  This website is a collection of crises, people, and organizations that will all eventually be aggregated using Google App Engine (GAE).  For this phase of the project, all we had to do was collect data on our items and display them in static pages.  While we were working on this, we were also supposed to get the back end to import XML files that conform to a given XML schema, and to export the data stored in the datastore back to XML.  The static pages and data collection weren't very difficult; our importer and exporter took some time to develop, though.  We learned that such scripts are very nit-picky and expect elements in a certain order.  Once we figured out what order we wanted things in, we had to make our exporter write the material in the same order.  It was more of a nuisance than anything, but we have things working!  Items are linked in the datastore through many-to-many relationships, which prevents duplicates.  This basically means that one item can be linked to several others and vice versa (a crisis can involve many organizations, and an organization can be involved in many crises), so each item is stored once and shared instead of copied.  It eliminates redundant information, because no one wants that, you know?  Anyway, the project seems to be not too bad, and our team is pretty awesome right now.  I have pretty positive vibes with everyone and we gel pretty well! It's gonna be really promising when we complete this project.
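To make the import/export round trip a bit more concrete, here is a minimal sketch using ElementTree from the standard library.  Everything specific in it is an assumption for illustration: the element names (worldCrises, crisis, organization, org), the id/idref attributes, and the plain dictionary standing in for the GAE datastore are all hypothetical, not our actual schema or models.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real project schema is not reproduced here.
SAMPLE = """<worldCrises>
  <crisis id="c1"><name>Example Crisis</name><org idref="o1"/></crisis>
  <crisis id="c2"><name>Another Crisis</name><org idref="o1"/></crisis>
  <organization id="o1"><name>Example Org</name></organization>
</worldCrises>"""


def import_xml(text):
    """Parse the XML into a dict that stands in for the datastore."""
    root = ET.fromstring(text)
    store = {"crises": {}, "orgs": {}, "links": set()}
    for org in root.findall("organization"):
        store["orgs"][org.get("id")] = org.findtext("name")
    for crisis in root.findall("crisis"):
        cid = crisis.get("id")
        store["crises"][cid] = crisis.findtext("name")
        # Each (crisis id, organization id) pair is one many-to-many link,
        # so the organization itself is stored only once.
        for ref in crisis.findall("org"):
            store["links"].add((cid, ref.get("idref")))
    return store


def export_xml(store):
    """Write the store back out in the same element order the importer reads."""
    root = ET.Element("worldCrises")
    for cid in sorted(store["crises"]):
        crisis = ET.SubElement(root, "crisis", id=cid)
        ET.SubElement(crisis, "name").text = store["crises"][cid]
        for link_cid, oid in sorted(store["links"]):
            if link_cid == cid:
                ET.SubElement(crisis, "org", idref=oid)
    for oid in sorted(store["orgs"]):
        org = ET.SubElement(root, "organization", id=oid)
        ET.SubElement(org, "name").text = store["orgs"][oid]
    return ET.tostring(root, encoding="unicode")
```

In this sketch, importing the exporter's output gives back the same store, which is the property we were fighting to get with the ordering.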

As far as the test was concerned, it was a bit difficult for me.  I thought I knew the material fairly well, but there were some questions I wasn't prepared for.  Downing had mentioned in class that we should be prepared for the programming questions, but I hadn't really prepared for them, so I suffered on that part of the exam.  I also didn't do as well as I had hoped on the multiple-choice part.  My z-score for the exam is still positive, but I feel like I could have done a lot better.  But things like that happen, and you can't always do your best.  So for the next exam, I'll try to make up for my low grade!

I'm really wondering what's in store for the next part of the project.  My thoughts are that we are going to make one Python page that regulates what will be displayed depending on what the user wants to view.  We'll also probably incorporate the linking between the GAE datastore and the website to allow for other pieces of data to be displayed on the website.

Well that's it for now, I need to catch up on sleep.

Until next time,

Corey

Tuesday, March 13, 2012

3/5 - 3/9 CS373 Blog Post

Good evening everybody.

At last, we are finally on Spring Break! Although it is great to not have to deal with school for a full week, it is kind of unfortunate that we have to work on a project.  On top of that, we have to work on this project with 5 other people, hoping that all of our schedules will match up properly.

The project seems like it'll be interesting to work on, but it will have to move slowly since we are in groups of more than 2 people.  Since we all (or at least most of us) haven't had experience using Google App Engine or ElementTree, we'll have to research and work on them together in order to get comfortable with them.  I've been taking a look at them myself for a little bit, and while they don't seem too bad, we definitely will have to do some decent collaboration in order to make any headway on the project.  So far, it seems like all of us will be available in the second half of spring break, most notably the weekend that follows. We'll have a lot of cramming to do in order to get this project done by next Wednesday! On top of that, we have an exam next Friday.  What seemed like a nice vacation now just seems like a lot of work being pushed off until next week!

In class last week, we discussed function overloading and generics.  I had basic knowledge of how they are used, and it seems like most of us use them subconsciously when writing functions.  I tend to design functions to support a decent number of data structures.  Maybe not every single kind, because then I may have to reconsider the complete structure of my program.  In Java, it's really a pain because of how *gross* the syntax looks.  Compared to Python, it is rather disgusting, but I suppose it is necessary if we want functions to support a variety of data types.  I mean, comparing

private static <T extends Comparable<? super T>> T max2 (T x, T y)


to


def my_max (x, y):



is a definite difference. Since Python is dynamically typed, the parameters need no type declarations, so the same function works for any arguments that can be compared with each other.
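Here is a small sketch of what the Python side might look like with a body filled in.  The body is my own assumption (the signature is all that appears above), written for Python 3:

```python
def my_max(x, y):
    """Return the larger of two mutually comparable values.

    No type declarations are needed: any pair of objects that
    supports the < operator with each other will work.
    """
    return y if x < y else x


# The same function handles ints, strings, and tuples.
biggest_int = my_max(2, 3)
biggest_str = my_max("abc", "abd")
biggest_tuple = my_max((1, 2), (1, 3))
```

The trade-off is that a mismatch such as my_max(1, "a") is only caught when the comparison actually runs, while the Java bound on T lets the compiler reject it up front.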


It'll be an interesting end to the break to see how things progress with the project. We'll see what happens! Hope everyone is enjoying their break.


Until next time,


Corey

Sunday, March 4, 2012

2/27 - 3/2 CS373 Blog Post

Good evening all,

One week until Spring Break!  Unfortunately, there is a long and arduous road ahead of me first.  I have a project and four midterms in my way until then!

This week was mainly about the Netflix project.  I wrote quite an extensive wiki page about it (viewed here) detailing the problem itself along with all of the algorithms I tried.  The basic idea of the Netflix project was to achieve a root mean squared error (RMSE) below Netflix's score, which was around 0.9474.  The RMSE measures the error between predicted ratings (which I would generate from the data) and the actual ratings that users gave movies.  Sifting through roughly 1.5 GB of data, I compiled some useful caches to be used in my program.  The two main caches contained the average rating for each movie and the average rating for each user.  With these two caches, I was able to predict a movie rating for whatever was sent to me in the probe file (this file contained a list of movies whose ratings needed to be predicted).  This project was not very hard to implement; the hardest part was tweaking the program and seeing what gave a better RMSE.  I stuck with the average movie rating and the average user rating and worked on how to combine the two into a semi-decent prediction.  I ultimately decided to make another cache: one that held values computed over all movie ratings combined and all user ratings combined.  I then calculated offsets from the average and applied weights to some of the numbers in order to arrive at a decent rating.  All of this is explained in heavy detail in the wiki, linked above!
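For anyone who wants to see the idea in code, here is a minimal sketch of the RMSE calculation and an offset-based prediction.  This is a common baseline form and not my exact formula from the wiki: the function names, the default weights of 1.0, and the clamping to the 1 to 5 rating range are all assumptions made for illustration.

```python
import math


def rmse(predictions, actuals):
    """Root mean squared error between two equal-length rating lists."""
    if not predictions or len(predictions) != len(actuals):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def predict(movie_avg, user_avg, overall_avg,
            movie_weight=1.0, user_weight=1.0):
    """Start from the overall average, then add weighted offsets.

    movie_weight and user_weight are hypothetical tuning knobs.
    """
    movie_offset = movie_avg - overall_avg  # how well-liked this movie is
    user_offset = user_avg - overall_avg    # how generous this user is
    raw = overall_avg + movie_weight * movie_offset + user_weight * user_offset
    # Keep the prediction inside the valid 1-to-5 star range.
    return min(5.0, max(1.0, raw))
```

Tweaking then comes down to trying different weights, re-running the predictions over the probe set, and keeping whichever combination gives the lowest rmse.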

We also talked about some more interesting things in Python, notably how many types share a lot of the same ideas.  Strings, tuples, and lists all behave alike when a string is passed to their constructors: each one builds a sequence from it that can be indexed and iterated in the same way.  It's a really interesting idea and one that makes sense.  I like how everything in Python is very succinct for the most part. Although, now I often catch myself writing code in my other classes and forgetting to add semicolons, brackets, and the like. Honestly, this class is making me look down on Java and look up to Python more and more, every day.  And I don't mind that!
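A quick sketch of what that looks like in practice (the variable names are just for illustration):

```python
text = "abc"

# Each constructor accepts the string and builds a sequence from it.
as_str = str(text)      # 'abc'
as_tuple = tuple(text)  # ('a', 'b', 'c')
as_list = list(text)    # ['a', 'b', 'c']

# The same sequence operations then work on all three.
lengths = [len(s) for s in (as_str, as_tuple, as_list)]
firsts = [s[0] for s in (as_str, as_tuple, as_list)]
contains_b = ["b" in s for s in (as_str, as_tuple, as_list)]
```

The main difference is that only the list can be changed in place afterwards; strings and tuples are immutable.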

I'm very curious as to how this new project is going to unfold.  I'm excited to see what all we need to do to make a nice looking website, and it'll be fun working with a big group.  It makes me wonder how we'll use Google Code and Python to build an aggregation-based website.

Well, I have lots to do right now, but next week should be a much more extensive blog post.

Until then,

Corey