This one is as easy as previous problems.
However, since it has been a long time since I have coded in C or in fact, any languages, I wanted to solve this problem incorporating some file handling, and linked list.
It is a weird choice and probably the most inefficient way of solving this.
The result I have gotten was incorrect because of how integer artithmetics in __int64 types.
The correct answer can be found from here: [Project Euler] Problem 3
How Sieve of Eratosthenes is Created
File Handling
I implemented sieve of Eratosthenes (for more details, click the link)in C and have saved them into a file.
The number I am working with is a not a huge number 600851475143, so using sieve of Eratosthenes is probably one of the slowest methods available.
However, for practice, I have used this method.
In the file, only the prime numbers and the last number the code has executed before were stored.
This was to eliminate computing power required to reprocess some numbers which have been determined to be prime or not.
So, if the computer has created sieve of Eratosthenes before up to the number 15.
The file would appear like:
2 3 5 7 11 13 15
Each number would be separated by a space (' ') with the last number representing the number it has calculated up until.
Linked List Speed
Unlike the algorithm presented in the gif image in the Wikipedia page of sieve of Eratosthenes, I have decided to delete values which have already been determined to be a multiple of a prime number. In other words, if there was 100,000 values to work with, instead of checking nearly 100,000 values for each prime number found, the number of values checked would decrease.
Memory
In terms of memory cost during execution time, using linked list would be less than array as well. Since an array uses the memory for the entire duration of execution. However, this statement would only be true if there are
However, for such a small number, this would not matter as much. However, for something like 50 digit number, it will matter.
The primary reason for using linked list is again, for my practice.
Optimizations
I. We only need to check prime values up to square root of a number to find the largest prime factor of number
I believe this statement is wrong but I have implemented this into my code. I obscurely remembered the statement and sadly, I remembered incorrectly. So, after writing the entire code, I have tried to confirm it on the internet but I was wrong! However, this was not the reason why I was left with an incorrect answer.
II. We only need to check for multiples of values up to half of the maximum value in sieve of Eratosthenes
The above statement is true. Sieve of Eratosthenes uses the concept of multiples of a prime number to eliminate any non-prime numbers since any numbers can be factored into only the prime numbers. So, when processing sieve of Eratosthenes, I have checked for multiples only until values less than (maxVal / 2 + 1).
As shown in while (linkList -> curr != NULL && linkList->curr->value < num / 2 + 1)
III. We can eliminate even numbers to check except for 2
For obvious reasons, 2 (prime number) is not eliminated from sieve of Eratosthenes. Then we know multiples of 2 are even numbers, so we can immediately even numbers in our sieve of Eratosthenes to halve the execution time.
As shown in +=2 in code
Code
I had three files: main.c
#include "linkedList.h"
#define NUM_LENGTH 15
__int64 initSofE(FILE *fp, int status, __int64 number, LL *linkList1);
__int64 getNextOddNumber(__int64 n);
int insertNode(LL *linkList1, __int64 val);
int deleteNode(LL *linkList1);
void printNumberOfNodes(LL *linkList1);
int processSofE(LL *linkList, __int64 num);
__int64 getGreatestPrimeFactor(LL *linkList, __int64 number);
int main(){
__int64 origNumber = 600851475143LL;
__int64 number = (long long int)sqrt(600851475143LL);
__int64 lastNumberProcessedBefore = 0;
int status = 0; /* 0 = file does not exist, need to initialize the entire LL
1 = file exists, need to initialized with the numbers
inside the file
*/
char filename[] = "C:\\Users\\parkk\\Project Euler\\Problem0003a\\SofE.txt";
FILE *ifp, *ofp;
ifp = fopen (filename, "r");
LL linkList1;
linkList1.head = NULL;
// File does not exist
if (ifp == NULL){
fprintf(stderr, "Can't open input file!\n");
fclose(ifp);
status = 0;
} else {
status = 1;
fclose(ifp);
}
ofp = fopen (filename, "w+");
if (ofp != NULL){
lastNumberProcessedBefore = initSofE(ofp, status, number, &linkList1);
if (!lastNumberProcessedBefore){
fprintf(stderr, "Error in filling linked list!\n");
return -1;
}
} else {
fprintf(stderr, "Error in opening file in a+ mode\n");
// No returning -1 since it just means the file does not exist
}
// Now that sieve has been initialized, the sieve must be processed
printNumberOfNodes(&linkList1);
if (!processSofE(&linkList1, number)){
fprintf(stderr, "Error in processing S of E\n");
return -1;
}
printToFile(&linkList1, ofp, number, lastNumberProcessedBefore);
printNumberOfNodes(&linkList1);
printf("%d\n", getGreatestPrimeFactor(&linkList1, origNumber));
fclose(ofp);
return 0;
}
void printToFile(LL *linkList, FILE *fp, __int64 number, __int64 lastNumber){
int status = 0;
linkList->curr = linkList->head;
while (linkList -> curr != NULL){
if (linkList->curr->value > lastNumber){
fprintf(fp, " %d", linkList->curr->value);
status = 1;
}
linkList->curr = linkList->curr->next;
}
if (status)
fprintf(fp, "%d\n", number);
}
__int64 getGreatestPrimeFactor(LL *linkList, __int64 number){
__int64 greatestPrimeFactor = 0;
Node *p;
p = linkList->head;
if (p == NULL){
fprintf(stderr, "S of E is empty.");
return -1;
}
while (p != NULL){
if (number % p->value == 0){
greatestPrimeFactor = p->value;
}
p = p->next;
}
printNumberOfNodes(linkList);
return greatestPrimeFactor;
}
int processSofE(LL *linkList, __int64 num){
__int64 div = 0, counter = 0;
Node *a;
if (linkList -> head == NULL){
fprintf(stderr, "No node in linkList [processSofE()]\n");
return 0;
}
if (linkList->head->next == NULL){
fprintf(stderr, "Only one node in linkList [processSofE()]\n");
return 0;
}
a = linkList->head->next;
if (a->next == NULL){
fprintf(stderr, "Only two nodes in linkList [processSofE()]\n");
return 0;
}
linkList->prev = a;
linkList->curr= a->next;
while (a->next != NULL){
div = a -> value;
linkList -> prev = a;
linkList -> curr = a -> next;
while (linkList -> curr != NULL && linkList->curr->value < num / 2 + 1){
if (linkList->curr->value % div == 0){
linkList->prev->next = linkList->curr->next;
free(linkList->curr);
linkList->curr = linkList->prev->next;
}
if (linkList -> curr != NULL){
linkList->prev = linkList->curr;
linkList->curr = linkList->curr->next;
}
}
a = a -> next;
}
return 1;
}
__int64 initSofE(FILE *fp, int status, __int64 number, LL *linkList1){
Node *newNode = (Node*)malloc(sizeof(Node));
if (newNode == NULL){
fprintf(stderr, "Error in allocating for the first node");
return 0;
}
newNode -> value = 2;
newNode -> next = linkList1->head;
linkList1->head = newNode;
linkList1->curr = newNode, linkList1->prev = newNode;
__int64 i;
char c, *s;
__int64 prime_num;
// File does not exist
if (status == 0){
// Create a linked list of possible prime number candidates
for (i = 3; i <= number; i+=2){
insertNode(linkList1, i);
if (linkList1->curr == NULL){
fprintf(stderr, "Error in insertNode()\n");
return 0;
}
}
// File exists
} else if (status == 1){
s = (char*)malloc(sizeof(char) * NUM_LENGTH);
if (s == NULL){
fprintf(stderr, "Error in allocating memory for file reading (*s)");
return 0;
}
// Initialize the string
for (i = 0; i < NUM_LENGTH; i++){
s[i] = '\0';
}
i = 0;
//Read the file
do{
c = fgetc(fp);
if (c == ' ' || c == EOF){
prime_num = strtoull(s, &s, 10); // Convert string to long long int
i = 0;
if (prime_num < number){
if (!insertNode(linkList1, prime_num))
return 0;
}
}else{
s[i++] = c;
}
}while (c != EOF);
deleteNode(linkList1); // The last number shows until what number has been checked for prime numbers
if (number > prime_num){
__int64 a = getNextOddNumber(prime_num);
while (a <= number){
if (!insertNode(linkList1, a)){
return 0;
}
a+=2;
}
}
free(s);
}
return 1;
}
__int64 getNextOddNumber(__int64 n){
if (n % 2 == 0){
return n+ 1;
} return n;
}
void printNumberOfNodes(LL *linkList1){
__int64 counter = 0;
Node *c = linkList1->head;
while (c != NULL){
//printf("%llu\n", c -> value);
counter++;
c = c -> next;
}
printf("counter: %d\n", counter);
}
linkedList.c
#include "linkedList.h"
// Adds newNode with value val to the node after which curr is pointing to
// Precondition: the linked list will always contain at least one node
int insertNode(LL *linkList1, __int64 val){
Node *newNode = (Node*)malloc(sizeof(Node));
if (newNode == NULL){
printf("Unable to allocate memory to newNode in insertNode()");
return 0;
}
if (linkList1 -> curr == NULL){
linkList1 -> curr = linkList1 -> head;
while (linkList1->curr->next != NULL)
linkList1->curr = linkList1->curr->next;
}
newNode -> value = val;
newNode -> next = linkList1 -> curr -> next;
linkList1 -> curr -> next = newNode;
linkList1 -> prev = linkList1->curr;
linkList1 -> curr = newNode;
return 1;
}
// Deletes a node where curr is pointing to
// Precondition: first node is not deleted
// curr is always pointing at a node in the linked list
int deleteNode(LL *linkList1){
if (linkList1 -> head == NULL)
return 0;
linkList1->prev->next = linkList1->curr->next;
free(linkList1->curr);
linkList1->curr=linkList1->prev->next;
return 1;
}
Despite all the loopholes, the reason behind the incorrect answer was surprisingly due to integer arithmetics of __int64 types. The answer the code spat out was 486847 which is much larger than the correct answer.
This was because 600851475143 % 486847 = 0 instead of 20.
This miscalculation has caused 486847 to be the largest prime factor of 600851475143.
The moment I found out the problem, I dropped everything because I felt I had enough practice on link lists and file handling and it was time to move on.
Loop Holes
the number to create sieve of Eratosthenes up until must be greater than 2 or it will fail the next time it executes while file exists
the code has segmentation faults when the file exists but is empty
after successfully outputting sieve of Eratosthenes to file, when the program is run again, it correctly retrieves the data but outputs the results incorrectly (This has been fixed by opening the file with "w+" mode despite the fact that the file needs to be rewritten from the start, slowing the execution time as a result.)
No comments:
Post a Comment