Friday, June 7, 2013

[Project Euler] Problem 3a in C (Wrong)

Problem: Project Euler Problem 3

This one is as easy as previous problems.
However, since it has been a long time since I have coded in C or in fact, any languages, I wanted to solve this problem incorporating some file handling, and linked list.
It is a weird choice and probably the most inefficient way of solving this.

The result I have gotten was incorrect because of how integer artithmetics in __int64 types.
The correct answer can be found from here: [Project Euler] Problem 3

How Sieve of Eratosthenes is Created

File Handling

I implemented sieve of Eratosthenes (for more details, click the link) in C and have saved them into a file.
The number I am working with is a not a huge number 600851475143, so using sieve of Eratosthenes is probably one of the slowest methods available.

However, for practice,  I have used this method.

In the file, only the prime numbers and the last number the code has executed before were stored.
This was to eliminate computing power required to reprocess some numbers which have been determined to be prime or not.
So, if the computer has created sieve of Eratosthenes before up to the number 15.
The file would appear like:
2 3 5 7 11 13 15
 Each number would be separated by a space (' ') with the last number representing the number it has calculated up until.

Linked List

Speed

Unlike the algorithm presented in the gif image in the Wikipedia page of sieve of Eratosthenes, I have decided to delete values which have already been determined to be a multiple of a prime number. In other words, if there was 100,000 values to work with, instead of checking nearly 100,000 values for each prime number found, the number of values checked would decrease.

Memory

In terms of memory cost during execution time, using linked list would be less than array as well. Since an array uses the memory for the entire duration of execution. However, this statement would only be true if there are

However, for such a small number, this would not matter as much. However, for something like 50 digit number, it will matter.
The primary reason for using linked list is again, for my practice.

Optimizations

I. We only need to check prime values up to square root of a number to find the largest prime factor of number

I believe this statement is wrong but I have implemented this into my code. I obscurely remembered the statement and sadly, I remembered incorrectly. So, after writing the entire code, I have tried to confirm it on the internet but I was wrong! However, this was not the reason why I was left with an incorrect answer.
Click to see the correct statement (Go to Trial division section)

II. We only need to check for multiples of values up to half of the maximum value in sieve of Eratosthenes

The above statement is true. Sieve of Eratosthenes uses the concept of multiples of a prime number to eliminate any non-prime numbers since any numbers can be factored into only the prime numbers. So, when processing sieve of Eratosthenes, I have checked for multiples only until values less than (maxVal / 2 + 1).

As shown in while (linkList -> curr != NULL && linkList->curr->value < num / 2 + 1)

III. We can eliminate even numbers to check except for 2

For obvious reasons, 2 (prime number) is not eliminated from sieve of Eratosthenes. Then we know multiples of 2 are even numbers, so we can immediately even numbers in our sieve of Eratosthenes to halve the execution time.

As shown in +=2 in code

Code

I had three files:
main.c


linkedList.c


linkedList.h

Reason behind Incorrect Answer

Despite all the loopholes, the reason behind the incorrect answer was surprisingly due to integer arithmetics of __int64 types. The answer the code spat out was 486847 which is much larger than the correct answer. 
This was because 600851475143 % 486847 = 0 instead of 20.
This miscalculation has caused 486847 to be the largest prime factor of 600851475143.

The moment I found out the problem, I dropped everything because I felt I had enough practice on link lists and file handling and it was time to move on.

Loop Holes

  • the number to create sieve of Eratosthenes up until must be greater than 2 or it will fail the next time it executes while file exists
  • the code has segmentation faults when the file exists but is empty
  • after successfully outputting sieve of Eratosthenes to file, when the program is run again, it correctly retrieves the data but outputs the results incorrectly (This has been fixed by opening the file with "w+" mode despite the fact that the file needs to be rewritten from the start, slowing the execution time as a result.)

No comments:

Post a Comment