PHPFixing

Wednesday, November 2, 2022

[FIXED] how to check if a file exists based on the files in a directory and indicate which are missing?

Tags: file, io, python

Issue

How do I check which files are missing from a directory, based on a text file listing the files I should have?

E.g., this is the list of files I should have:

A
B
C
D
E
F
G
H
I

But in my directory I only have:

A.npy
B.npy
C.npy
D.npy

So I want to write a script that produces a result.txt like this:

A [exists]
B [exists]
C [exists]
D [exists]
E [does not exist]
F [does not exist]
G [does not exist]
H [does not exist]
I [does not exist]

This is the script I currently have, but it doesn't work: it reports every file as "does not exist" :(

import os
import copy
import pandas as pd
import shutil
from pathlib import Path



# read training files.txt 
path_to_file = 'xxxxxxxxxxxxxxxxxx/train_files_CS/all_training_CSmaster.txt'
path = 'xxxxxxxxxxx/train_files_CS'

# list of training npy files in directory
lof = []
for (dirpath, dirnames, filenames) in os.walk(path):
  lof.append(filenames)

lof = [x[:len(x) - 4] for x in lof[0] if x[0] == 'P']
#print(lof)

# new file to be written into
f = open('check_training.txt', 'w')

existing_files = 0
missing_files = 0

trfiles = []
with open(path_to_file) as file:
    for line in file:
        #print(line.rstrip())
        trfiles.append(line)
        
for x in trfiles:    
    if x in lof:
        existing_files+=1
        f.write(x)
        f.write("...[exists] \n")
    else:
        missing_files+=1
        f.write(x)
        f.write("  ...[doesn't exist] \n")
            
f.close()

print("\nthe missing files are:", missing_files,"\n")
print("the existing files are:",existing_files,"\n")

Any help is appreciated, thank you! :)


Solution

Your program works for me after fixing the following two issues:

Issue 1

lof = [x[:len(x) - 4] for x in lof[0] if x[0] == 'P']

I don't think you want to list only files that start with the letter 'P'. Perhaps this condition was left over from debugging. To get all file names, remove the if x[0] == 'P' part:

lof = [x[:len(x) - 4] for x in lof[0]]

Issue 2

with open(path_to_file) as file:
    for line in file:
        #print(line.rstrip())
        trfiles.append(line)

This doesn't remove the line break characters, so you end up with ['A\n', 'B\n', ...], whose elements never match the names in lof in the comparison in the next step. Use this:

with open(path_to_file) as file:
    trfiles = file.read().splitlines()

With these two changes you should find you get the expected output.
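Putting both fixes together, the check can be sketched roughly as below. This is a minimal sketch rather than the original script: the function name check_files, the demo file names (expected.txt, result.txt) and the temporary directory are all made up for illustration, and it assumes only files directly inside the directory matter.

```python
import os
import tempfile


def check_files(list_path, directory, report_path):
    """Write '<name> [exists]' or '<name> [does not exist]' per listed name.

    Hypothetical helper for illustration; returns (existing, missing) counts.
    """
    # Expected names, one per line. splitlines() drops the trailing '\n'
    # (Issue 2); blank lines are skipped.
    with open(list_path) as fh:
        expected = [line.strip() for line in fh.read().splitlines() if line.strip()]

    # Names of regular files in the directory, extensions removed,
    # with no filter on the first letter (Issue 1).
    present = {
        os.path.splitext(entry.name)[0]
        for entry in os.scandir(directory)
        if entry.is_file()
    }

    missing = 0
    with open(report_path, "w") as out:
        for name in expected:
            if name in present:
                out.write(f"{name} [exists]\n")
            else:
                missing += 1
                out.write(f"{name} [does not exist]\n")
    return len(expected) - missing, missing


# Demo on a throwaway directory that mirrors the example in the question.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = os.path.join(tmp, "data")
    os.mkdir(data_dir)
    for name in "ABCD":
        open(os.path.join(data_dir, name + ".npy"), "w").close()

    list_path = os.path.join(tmp, "expected.txt")
    with open(list_path, "w") as fh:
        fh.write("\n".join("ABCDEFGHI") + "\n")

    report_path = os.path.join(tmp, "result.txt")
    existing_count, missing_count = check_files(list_path, data_dir, report_path)
    with open(report_path) as fh:
        report = fh.read()

print(report)
print("existing:", existing_count, "missing:", missing_count)
```

Using a set for the names found in the directory keeps each membership test cheap, which may matter if the list of expected files is long.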

Other tips

There are quite a few places where you can make your code more concise and readable by using list comprehensions instead of for loops. E.g.

lof = []
for (dirpath, dirnames, filenames) in os.walk(path):
    lof.append(filenames)

Can be:

lof = [filenames for (dirpath, dirnames, filenames) in os.walk(path)]
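Since the script only ever uses lof[0], i.e. the files directly inside path, walking the sub-directories is not strictly needed. A possible sketch, assuming sub-directories can be ignored (the helper name top_level_files is hypothetical):

```python
import os
import tempfile


def top_level_files(path):
    """Return the names of files directly inside path (no recursion)."""
    # os.walk yields the top directory first, so next() takes only that
    # (dirpath, dirnames, filenames) triple. The default value covers the
    # case where path does not exist and os.walk yields nothing.
    _dirpath, _dirnames, filenames = next(os.walk(path), (path, [], []))
    return sorted(filenames)


# Demo: two files at the top level and one inside a sub-directory.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("A.npy", "B.npy", os.path.join("sub", "C.npy")):
        open(os.path.join(tmp, rel), "w").close()
    names = top_level_files(tmp)

print(names)  # ['A.npy', 'B.npy']
```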

Also, x[:len(x) - 4] is not a robust way to remove the extension from a file name: it assumes a three-letter extension, and extensions can be longer (.html, .docx, etc.). Use os.path.splitext from the os module instead:

lof = [os.path.splitext(x)[0] for x in lof[0]]
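To see the difference, here is a small comparison of the two approaches. The file names are invented for illustration only:

```python
import os

# Made-up names with extensions of different lengths, plus one without any.
names = ["A.npy", "page.html", "report.docx", "archive.tar.gz", "README"]

# Slicing blindly drops the last four characters.
sliced = [n[:len(n) - 4] for n in names]

# splitext removes only the final extension, if there is one.
split = [os.path.splitext(n)[0] for n in names]

print(sliced)  # ['A', 'page.', 'report.', 'archive.ta', 'RE']
print(split)   # ['A', 'page', 'report', 'archive.tar', 'README']
```

Note that os.path.splitext only strips the last extension, so archive.tar.gz becomes archive.tar rather than archive.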


Answered By - ljdyer
Answer Checked By - Timothy Miller (PHPFixing Admin)